Lesson No-4 Data Types and Variables in C language

In this section we will cover the concept of data types and how we can use them to create variables. We will also do some basic output of variables to show how to do more advanced console output.

What is a Data Type?

Simply put, a data type is a representation of some sort of real-world value. Now, as you have probably picked up, I'm not going to keep it simple. The whole point of learning C is to understand the way your computer works at a deeper level (and C is an awesome language, can't forget that). Let's talk about memory.

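Type	Size (bytes)	Range
int or signed int	2	-32,768 to 32,767
unsigned int	2	0 to 65,535
short int or signed short int	2	-32,768 to 32,767
unsigned short int	2	0 to 65,535
long int or signed long int	4	-2,147,483,648 to 2,147,483,647
unsigned long int	4	0 to 4,294,967,295

These are the sizes on an old 16-bit compiler, which are also the minimum sizes the C standard guarantees. On most modern compilers, including the cl compiler used in these lessons, an int is 4 bytes and a short is 2 bytes.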

As we covered earlier, when your program is loaded the kernel allocates memory for your program to run, but how does it know how much to allocate? Well, it depends on how much data your program says it needs. I'll give you an example. There is a data type called int (as we talked about in the previous lesson). For this lesson we'll treat an int as 16 bits, or 2 bytes of memory; that is the minimum the C standard allows, and most modern compilers actually use 32 bits. According to the rules of binary, 16 bits gives your int a minimum value of -32,768 and a max value of 32,767. If the integer is unsigned (the number can't be negative) that range changes to 0 - 65,535. This is because of the way signs work in binary: a signed int uses its top bit as a "sign bit" that marks the number as positive or negative, which leaves one less bit for the number itself and so halves your max number.

Now all of that may seem a little... well... pointless to know, right? Well no, because it explains how a data type gets stored in memory. For all intents and purposes there is no such thing as a "data type" once your program is running. All a "data type" is in the first place is a way of telling the compiler how to lay out the memory and how to treat that section of memory. Let me do one illustration to drive the point home. Examine this section of binary I've written; in this example, we're going to pretend an int is 4 bytes.

0000 0000
0000 0000
0000 0000
0000 0000

In this example, we have an integer laid out in memory, one byte per row, and its value is currently 0 because all bits are off. When your program runs and it wants to access this variable in memory, it has to have some way to find it, so it uses something called a memory address. We'll talk much more about memory addresses later on, but for now, know it's literally no different than using a house number to find a house: the program uses a memory address to find a variable. Now, the memory address only points at the first byte, so how does it know to use the next 3 bytes as part of the operation? Well, the data type. We said earlier that ints, in this case, will be 4 bytes, so the compiler knows that the 3 bytes after the first byte are part of the same variable. If we wanted to increment the value of this integer by 1, the program would go to the memory address, read the current value, and change whichever bits it needs to in order to up the value by 1. The int in memory would then look like this:

0000 0000
0000 0000
0000 0000
0000 0001

Notice that the bottom right bit got changed. That's because, like Japanese, binary gets read right to left: the rightmost bit is the smallest one (the "ones" place), so adding 1 changes the "last" bit on the "last" byte. It only appears this way because of how I've drawn it out; in actuality, the order the bytes sit in memory depends on your CPU.

Mathematical data types

I know I've been going into a great deal of detail, but man, if you think I'm going to go through every one of these you are out of your mind. You can derive what each of these does based off 3 simple facts: a short is a shorter version of an int (or at least never a longer one), an int is the typical 16-bit int we talked about earlier, and a long is a longer version of an int (or at least never a shorter one). We already talked about signed and unsigned, so you should be able to figure out what each of these is based on that.

short

short int

signed short

signed short int

unsigned short

unsigned short int

int

signed

signed int

unsigned

unsigned int

long

long int

signed long

signed long int

unsigned long

unsigned long int

long long

long long int

signed long long

signed long long int

unsigned long long

unsigned long long int

Now there's more than that, but we'll cover them as they are needed. I just wanted to show you there are lots of different mathematical data types.

Characters and...Wait, that's all there is...?

So...if you are coming from another language, this may shock and terrify you. C in and of itself does not have strings. Look all around you, there are characters, characters EVERYWHERE, but not a string in sight. Oh, woe is me, my eyes doth bleed at the sight of not having thy strings! Yeah...well, get over it. Now, if you are really desperate for your strings I won't hide this fact from you, this is a C++ compiler, so you could include the string library from C++ and use strings. That said, I HEAVILY advise you to not do that, you will learn much more by using characters first.

What is a String in the First Place?

So in older languages, there actually were "strings", but not as we know them today. See, strings get their name from the way that they are created. A string is a collection of characters in memory that is "strung" together, hence the name "string". Now, you would think that it would be relatively easy to create a string data type, right? Well... sort of, but not really. We'll talk more about arrays later, but for now know that an array is a collection of data of the same type that gets stored together in memory. So, a string in C is simply an array of characters.

That means C has one disadvantage other languages don't have, and that is string processing, or doing operations on a string. For example, in C# if I want to make a string uppercase, I simply type "string.ToUpper()", and that will make the string uppercase. In C, I would have to iterate through the whole array, figure out what character is what, and change it to its capitalized ASCII counterpart. For this reason, there are entire sections of C programming books that deal strictly with string processing.

So that sounds pretty awful, but honestly, it's not a huge issue. C is not meant to be used for string processing; it's a language for interfacing with the OS, pushing data around very quickly, writing games, drivers, etc. Typically if you really just absolutely had to have a string for some reason you'd just use C++, but there are advantages to having arrays of characters instead of strings, more on that later. That said, for anything relating to text... you just need to know the data type:

char

Variables

So, I honestly can't remember if I've used the term "variable" yet, but now is the time to cover it. A variable is an extremely simple concept to understand, so wipe the sweat off of your face, back, nether regions, etc. and relax. A variable is just a named section of memory. Now, that's not 100% accurate. If you want to be really technical, it's more like a named memory address, where anytime you use the variable in the program it is sort of an alias for that memory address, but then that's not totally true either because that's more what a pointer is... This seems to be a no-win situation, so if you can think of a better way of describing what a variable is, go make your own website/tutorials. Anyways... that's it, I have no more super in-depth jargon to tell you so... hooray, a short section!

Programming with Data Types and Variables

Oh boy! After all this time we finally get to write some code. Alright, I promised I would help you set up your build system for each program for the first few exercises and I am a man of my word, so here we go. Go into your folder that you saved your Lesson 1 code to; by the way, I'm sorry about this weird numbering system that we've got going on for each lesson, Lecture 1 is used by lecture 2 and now lecture 2 is going to be explained in lecture 4...I know, I'm the worst, but I think you'll be fine. Anyways, back to what I was saying earlier, when you are in your folder that holds the Lesson1 folder, create a new folder and call it Lesson2. Take all the folders from Lesson1 and copy them into Lesson2. Go into your code folder and open the build.bat file from Lesson1. Change this:

@echo off

mkdir ..\build

pushd ..\build

cl -Zi c:\CWebsiteTutorials\Lesson1\code\main.cpp user32.lib gdi32.lib

popd

To this:

@echo off

mkdir ..\build

pushd ..\build

cl -Zi c:\CWebsiteTutorials\Lesson2\code\main.cpp user32.lib gdi32.lib

popd

Alright, that's it. All you will need to do for each lesson is make a new folder, copy the old build and code folders into it, and change the lesson number in the build file. I told you once you get set up it's super easy. Hopefully that will help you trust me whenever I say things suck now but they get better!

Creating Variables

Alright, now I want you to start using your text editor you downloaded. I will not explain how these text editors work, it would take WAY too much time to do so, so if you are confused, search Google.

As an aside, I want to mention this: when a programmer tells you to Google it, they aren't being condescending. What we mean is Google it. You can almost always find somebody who has run into the same error or weird bug you are having (especially during the beginner phases of learning programming). Programmers get really, really mad when you ask easy questions that have hundreds of posts online on how to fix them, so always use Google. I use Google probably a hundred times a day when I write code, no joke. I use Google as I'm writing these tutorials. Nobody knows every single fact about programming, and nobody expects you to, but they do expect you to be able to work through problems and only ask when you absolutely cannot find the answer you need. Alright, moving on.

Creating and Outputting Variables

So, with your text editor opened up, let's write some code. At this point for code, I'll start using images for two reasons:

· You can't copy and paste the text.

· You can see my syntax highlighting and everything.

Type the following code into your text editor and save it as main.cpp in your Lesson2 code folder. When you run this program you should get 2 as your output. Now, you'll see that creating our variables is a very easy process: we simply declare its data type, its name, and its value. Now, something that sometimes throws people: equals in programming means "take the value on the right-hand side, and store it in the variable on the left-hand side". This is called an "assignment operator". Notice how each assignment ends with a semi-colon. It's going to be a little hard at first to remember where semi-colons go, but you'll get the hang of it soon enough.

The second thing we need to cover is how to actually output variables to the console window, which sounds a bit pointless but is actually a very important thing to have down for when you start writing applications for yourself. Notice in the printf function we passed it a string "%i". Whenever you put a % before certain characters, it is called a "Conversion Specifier". Essentially, printf only outputs text; it cannot output an integer directly, so when we place %i in the middle of a string, we are telling it we want to convert an integer to text and place it in that spot.

This part gets a little weird so bear with me. As mentioned previously, printf takes 1 argument, which is a string to output. What I didn't tell you is it has more than 1 argument it can take. In a function call, whenever you put a comma in between things, it means you want to pass in another argument (again, this will all make perfect sense when we start actually making functions for ourselves). So printf is a little unique in that it can take an indefinite number of arguments, whereas most functions take a very specific number of arguments. For example, if I had a function called "CreateWindow()", it may take an argument for the height, width, title of the window, and whether or not to allow it to be maximized and minimized; this is the general way functions are written. This is different with printf because of the nature of I/O: the number of variables somebody may need to output could be 0, 1, or 10,000, we can never be sure, so printf has to work differently from a normal function.

Essentially what happens when your program runs is printf reads through the string and, each time it finds a conversion specifier, it takes the next argument passed after the string, converts that data to text, and plugs it into that spot, in order from first to last. Be careful here: if the specifiers and the arguments don't match up, C does not promise you an error. Many compilers will warn you about it, but otherwise you just get garbage output or a crash.

Sounds complicated, but again... it's complicated because I'm explaining it in great detail. All you really need to know is you can type % and a certain letter after it to represent that it should plug a variable into that spot, then pass in the same number of variables as arguments to the function. Here is a list of the most common conversion specifiers:

%d and %i int (signed decimal integer)

%u unsigned decimal integer

%f floating point values (fixed notation) - float, double

%e floating point values (exponential notation)

%s string

%c character

Okay, let's modify the program a little bit to illustrate in greater detail how this works. Copy the below code. When you run this program, you should get the following output:

The value of x is:2, the value of y is:3

x + y = 5

There are two new things to talk about. First off, in the first printf you'll notice we typed "\n" at the end of the printf string. This is called an escape sequence. It's similar to how conversion specifiers work, but a bit different. Whenever the C compiler sees a backslash in a string, it looks at the next character, and that next character determines the behavior of the escape sequence. In this case, \n means start a new line. That may seem stupid, until you realize that you can't just press Enter in the middle of a string, and outside of strings the compiler ignores almost all white space, so there's no other way of telling it to start a new line. In fact, that's why we have to use stuff like semicolons: to identify when we are done with a statement so the compiler knows. This is how it works in pretty much every programming language, with some exceptions. Heck, most languages even use escape sequences, so definitely remember at minimum \n. Here is a list of some of the escape sequences (from Wikipedia):

\a Alarm (Beep, Bell)

\b Backspace

\f Formfeed

\n Newline (Line Feed)

\r Carriage Return

\t Horizontal Tab

\v Vertical Tab

\\ Backslash

\' Single quotation mark

\" Double quotation mark

\? Question mark

\nnn The character whose numerical value is given by nnn interpreted as an octal number

\xhh The character whose numerical value is given by hh interpreted as a hexadecimal number

The last thing I want to point out is that we did some arithmetic (addition) right inside the second printf. I wanted to show you that this is a thing you can do... not a lot to explain really, just a neat little feature you will for sure use in one way or another.

Just for one final fun thing, I want you to write out this code and run it whenever you want to annoy somebody. You won't understand some of the code, and that's fine. I just want to show the joy of programming. Make sure people around you aren't sleeping. Build it, run it, hate your life, and close the console window because you will have no other choice.

Conclusion

In conclusion, we covered data types, what they are, how they get stored in memory, how to create variables with data types, assign them values, output those values with printf, and escape sequences, some of which were extremely annoying. Not bad for 1 lesson. The next lesson is going to be a short one (I almost shouldn't make it its own lesson), we will be covering comments, which is literally the easiest concept to understand in programming. Again, go grab another coffee, take a break, do whatever and come back ready to learn!
