Data Types and Variables in C
Data Types and Variables
In this section we will cover the concept of data types and how we can use
them to create variables. We will also do some basic output of variables to
show how to do more advanced console output.
What is a Data Type?
Simply put, a data type is a representation of some sort of real-world value.
Now, as you have probably picked up, I'm not going to keep it that simple. The
whole point of learning C is to understand the way your computer works at a
deeper level (and C is an awesome language, can't forget that). Let's talk
about memory.
Type | Size (bytes) | Range
int or signed int | 2 | -32,768 to 32,767
unsigned int | 2 | 0 to 65,535
short int or signed short int | 2 | -32,768 to 32,767
unsigned short int | 2 | 0 to 65,535
long int or signed long int | 4 | -2,147,483,648 to 2,147,483,647
unsigned long int | 4 | 0 to 4,294,967,295
These are the smallest sizes C allows for each type. Your compiler may use
larger ones; on most modern desktop compilers an int is 4 bytes.
As we covered earlier, when your program is loaded the kernel allocates memory
for your program to run, but how does it know how much? Well, it depends on
how much data your program has asked for. I'll give you an example. There is a
data type called int (as we talked about in the previous lesson). For this
lesson we'll treat an int as 16 bits, or 2 bytes, of memory; that is the
minimum C guarantees, and many modern compilers use 32 bits instead. According
to the rules of binary, 16 bits gives your int a minimum value of -32,768 and a
max value of 32,767. If the integer is unsigned (the number can't be negative),
that range changes to 0 to 65,535. This is because of the way signs work in
binary: a signed int reserves one bit, called the "sign bit", to record whether
the number is positive or negative, and that, of course, limits your max
number.
Now all of that may seem a little... well... pointless to know, right? Well no,
because it explains how a data type gets stored in memory. For all intents and
purposes there is no such thing as a "data type" once the program is running.
All a "data type" is in the first place is a way of telling the compiler how to
lay out a section of memory and how to treat it. Let me do one illustration to
drive the point home. Examine this section of binary I've written; in this
example, we're going to pretend an int is 4 bytes.
0000 0000
0000 0000
0000 0000
0000 0000
In this example, we have an integer laid out in memory, and its value is
currently 0 because all bits are off. When your program runs and wants to
access this variable in memory, it has to have some way to find it, so it uses
something called a memory address. We'll talk much more about memory addresses
later on, but for now, know it's literally no different than using a house
number to find a house: the program uses a memory address to find a variable.
Now, the memory address only applies to the first byte, so how does the program
know to use the next 3 bytes as part of the operation? Well, the data type. We
said earlier that ints, in this case, are 4 bytes, so the compiler knows that
the 3 bytes after the first byte are part of the same variable. If we wanted to
increment the value of this integer by 1, the program would go to the memory
address, read the current value, add 1, and write the result back. The int in
memory would then look like this:
0000 0000
0000 0000
0000 0000
0000 0001
Notice that the bottom right bit got changed. That's because, just like the
digits of an ordinary decimal number, the rightmost bit of a binary number is
worth the least, so adding 1 changes the "last" bit on the "last" byte. It only
appears this way because of how I've drawn it out; in actuality, the order of
the bytes in memory depends on your machine and may not look this way.
Mathematical data types
I know I've been going into a great deal of detail, but man, if you think I'm
going to go through every one of these you are out of your mind. You can derive
what each of these does based on 3 simple facts: a short is a shorter version
of an int, an int is the typical 16-bit int we talked about earlier, and a long
is a longer version of an int. We already talked about signed and unsigned, so
you should be able to figure out what each of these is based on that:
short
short int
signed short
signed short int
unsigned short
unsigned short int
int
signed
signed int
unsigned
unsigned int
long
long int
signed long
signed long int
unsigned long
unsigned long int
long long
long long int
signed long long
signed long long int
unsigned long long
unsigned long long int
Now, there are more than that, but we'll cover them as they are needed. I just
wanted to show you there are lots of different mathematical data types.
Characters and...Wait, that's all there is...?
So...if you are coming from another language, this
may shock and terrify you. C in and of itself does not have strings. Look all
around you, there are characters, characters EVERYWHERE, but not a string in
sight. Oh, woe is me, my eyes doth bleed at the sight of not having thy strings!
Yeah...well, get over it. Now, if you are really desperate for your strings I
won't hide this fact from you, this is a C++ compiler, so you could include the
string library from C++ and use strings. That said, I HEAVILY advise you to not
do that, you will learn much more by using characters first.
What is a String in the First Place?
So in older languages, there actually were "strings", but not as we know them
today. See, strings get their name from the way that they are created. A string
is a collection of characters in memory that is "strung" together, hence the
name "string". Now, you would think that it would be relatively easy to create
a string data type, right? Well... sort of, but not really. We'll talk more
about arrays later, but for now know that an array is a collection of data of
the same type that gets stored together in memory; it's sort of like having a
collection of data bound together. So, a string in C is simply an array of
characters.
That means C has one disadvantage other languages don't have, and that is
string processing, or doing operations on a string. For example, in C# if I
want to make a string uppercase, I simply type "string.ToUpper()", and that
gives me the string in uppercase. In C, I would have to iterate through the
whole array, figure out what character is what, and change it to its
capitalized ASCII counterpart. For this reason, there are entire sections of C
programming books that deal strictly with string processing.
So that sounds pretty awful, but honestly, it's not a huge issue. C is not
meant to be used for string processing; it's a language for interfacing with
the OS, pushing data around very quickly, writing games, drivers, etc.
Typically if you really just absolutely had to have a string for some reason
you'd just use C++, but there are advantages to having arrays of characters
instead of strings; more on that later. That said, for anything relating to
text... you just need to know the data type
char
Variables
So, I honestly can't remember if I've used the term "variable" yet, but now is
the time to cover it. A variable is an extremely simple concept to understand,
so wipe the sweat off of your face, back, nether regions, etc. and relax. A
variable is just a named section of memory. Now, that's not 100% accurate. If
you want to be really technical, it's more like a named memory address, where
anytime you use the variable in the program, it is sort of an alias for the
memory address. But then that's not totally true either, because that's more
what a pointer is... This seems to be a no-win situation, so if you can think
of a better way of describing what a variable is, go make your own
website/tutorials. Anyways... that's it, I have no more super in-depth jargon to
tell you so... hooray, a short section!
Programming with Data Types and Variables
Oh boy! After all this time
we finally get to write some code. Alright, I promised I would
help you set up your build system for each program for the first few exercises
and I am a man of my word, so here we go. Go into your folder that you saved your
Lesson 1 code to; by the way, I'm sorry about this weird numbering system that
we've got going on for each lesson, Lecture 1 is used by lecture 2 and now
lecture 2 is going to be explained in lecture 4...I know, I'm the worst, but I
think you'll be fine. Anyways, back to what I was saying earlier, when you are
in your folder that holds the Lesson1 folder, create a new folder and call it
Lesson2. Take all the folders from Lesson1 and copy them into Lesson2. Go into
your code folder and open the build.bat file from Lesson1. Change this:
@echo off
mkdir ..\build
pushd ..\build
cl -Zi c:\CWebsiteTutorials\Lesson1\code\main.cpp user32.lib gdi32.lib
popd
To this:
@echo off
mkdir ..\build
pushd ..\build
cl -Zi c:\CWebsiteTutorials\Lesson2\code\main.cpp user32.lib gdi32.lib
popd
Alright, that's it. All you will need to do for each lesson is make a new folder, copy the old build and code folders into it, and change the lesson number in the build file. I told you once you get set up it's super easy. Hopefully that will help you trust me whenever I say things suck now but they get better!
Creating Variables
Alright, now I want you to start using your text editor you downloaded. I will
not explain how these text editors work, it would take WAY too much
time to do so, so if you are confused, search Google.
As an aside, I want to mention this: when a programmer tells you to Google it, they aren't being condescending. What we mean is Google it. You can almost always find somebody who has run into the same error or weird bug you are having (especially during the beginner phases of learning programming). Programmers get really, really mad when you ask easy questions that have hundreds of posts online on how to fix them, so always use Google. I use Google probably a hundred times a day when I write code, no joke. I use Google as I'm writing these tutorials. Nobody knows every single fact about programming, and nobody expects you to, but they do expect you to be able to work through problems and only ask when you absolutely cannot find the answer you need. Alright, moving on.
Creating and Outputting Variables
So, with your text editor
opened up, let's write some code. At this point for code, I'll start using images
for two reasons:
· You can't copy and paste the text.
· You can see my syntax highlighting and everything.
Type the following code into your text editor and save it as main.cpp in your lesson 2
code folder. When you run this program you should get 2 as your output.
Now, you'll see that creating our variables is a very easy process: we simply
declare its data type, its name, and its value. Now, something that sometimes
throws people: equals in programming means "take the value on the right-hand
side, and store it in the variable on the left-hand side". This is called an
"assignment operator". Notice how each assignment ends with a semicolon. It's
going to be a little hard at first to remember where semicolons go, but you'll
get the hang of it soon enough; it's actually quite easy once you really get
right down to it.
The second thing we need to cover is how to actually output variables to the console window, which sounds a bit pointless but is actually a very important thing to have down for when you start writing applications for yourself. Notice that in the printf function we passed it the string "%i". Whenever you place a % before certain characters, it is called a "conversion specifier". Essentially, printf only outputs text; it cannot output an integer directly, so when we place %i in the middle of a string, we are telling it we want to convert an integer to text and place it in that spot.
This part gets a little weird, so bear with me. As mentioned previously, printf takes 1 argument, which is a string to output. What I didn't tell you is that it can take more than 1 argument. In a function call, whenever you put a comma between things, it means you want to pass in another argument (again, this will all make perfect sense when we start actually making functions for ourselves). printf is a little unusual in that it can take an indefinite number of arguments, whereas most functions take a very specific number. For example, if I had a function called "CreateWindow()", it may take an argument for the height, the width, the title of the window, and whether or not to allow it to be maximized and minimized; this is the general way functions are written. printf is different because of the nature of I/O: the number of variables somebody may need to output could be 0, 1, or 10,000, and we can never be sure, so the function has to be declared in a special way.
Essentially, what happens when the program runs is that printf walks through the string looking for conversion specifiers. Each time it finds one, it takes the next argument passed after the string, converts that data to text, and plugs it into that spot, in order from first to last. If the arguments don't match the specifiers, the result is unpredictable. Many compilers will warn you about a mismatch when you build, but they are not required to.
Sounds complicated, but again... it's complicated because I'm explaining it in great detail. All you really need to know is that you can type % and a certain letter after it to say that a variable should be plugged into that spot, then pass in the same number of variables as arguments to the function. Here is a list of the most common conversion specifiers:
%d and %i | int (signed decimal integer)
%u | unsigned decimal integer
%f | floating point values (fixed notation) - float, double
%e | floating point values (exponential notation)
%s | string
%c | character
Okay, let's modify the program a little bit to illustrate in greater detail how this works. Copy the code below. When you run this program, you should get the following output:
The value of x is:2, the value of y is:3
x + y = 5
There are two new things to talk about. First off, in the first printf you'll notice we typed "\n" at the end of the printf string. This is called an escape sequence. Escape sequences look similar to conversion specifiers, but they work a bit differently. Whenever the C compiler sees a backslash in a string, it looks at the next character, and that character determines the behavior of the escape sequence. In this case, \n means start a new line. That may seem stupid, until you realize that you can't just press Enter in the middle of a string. Outside of strings, the compiler treats spaces, tabs, and line breaks as nothing more than separators, so there's no other way of telling it to start a new line. That's also why we have to use stuff like semicolons: they mark when we are done with a statement so the compiler knows. This is how it works in pretty much every programming language, with some exceptions. Heck, most languages even use escape sequences, so definitely remember \n at minimum. Here is a list of some of the escape sequences (from Wikipedia):
\a Alarm (Beep, Bell)
\b Backspace
\f Formfeed
\n Newline (Line Feed)
\r Carriage Return
\t Horizontal Tab
\v Vertical Tab
\\ Backslash
\' Single quotation mark
\" Double quotation mark
\? Question mark
\nnn The character whose numerical value is given by nnn interpreted as an octal number
\xhh The character whose numerical value is given by hh interpreted as a hexadecimal number
The last thing I want to point out is that we did an arithmetic operation (addition) in the second printf. I wanted to show you that this is a thing you can do. There's not a lot to explain really; it's just a neat little feature you will for sure use in one way or another.
Just for one final fun thing, I want you to write out this code and run it whenever you want to annoy somebody. You won't understand some of the code, and that's fine; I just want to show the joy of programming. Make sure people around you aren't sleeping. Build it, run it, hate your life, and close the console window because you will have no other choice.
Conclusion
In conclusion, we covered data types, what they are, how they
get stored in memory, how to create variables with data types, assign them
values, output those values with printf, and escape sequences, some of which
were extremely annoying. Not bad for 1 lesson. The next lesson is going to be a
short one (I almost shouldn't make it its own lesson), we will be covering
comments, which is literally the easiest concept to understand in programming.
Again, go grab another coffee, take a break, do whatever and come back ready to
learn!