Skip to content

C Programming Language Fundamentals

Tokens and Basic Elements in C

Token

  • A token is the smallest individual unit or element in a C program that has meaning to the compiler
  • In C, there are five types of tokens:
  • Keywords
  • Identifiers
  • Constants
  • String Literals
  • Operators and Special Symbols
  • The C compiler identifies these tokens during lexical analysis

Tokenizing

  • Tokenizing (or lexical analysis) is the first phase of C compilation
  • The C preprocessor and lexical analyzer process the source code to identify tokens
  • Comments and whitespace are typically removed during this phase
  • Example: int main(void) is broken into tokens: int (keyword), main (identifier), ( (delimiter), void (keyword), ) (delimiter)

Keyword

  • C has 32 reserved keywords that cannot be used as identifiers
  • Common C keywords include:
  • Data types: int, char, float, double, void
  • Control flow: if, else, for, while, do, switch, case, break, continue, return
  • Storage classes: auto, static, extern, register
  • Type qualifiers: const, volatile

Identifier

  • Names for variables, functions, structures, unions, and labels in C
  • Rules in C:
  • Must start with a letter or underscore
  • Can contain letters, digits, and underscores
  • Case sensitive (countCount)
  • Cannot use C keywords
  • Examples:
  • Valid: main, _temp, counter1, studentMarks
  • Invalid: 2count, while, my-var

Operator

  • C operators by category:
  • Arithmetic: +, -, *, /, % (integer division and modulo in C)
  • Increment/Decrement: ++, -- (unique to C and C-like languages)
  • Relational: >, <, >=, <=, ==, !=
  • Logical: &&, ||, !
  • Bitwise: &, |, ^, ~, <<, >>
  • Assignment: =, +=, -=, *=, /=, %=, <<=, >>=, &=, ^=, |=
  • Special: sizeof, & (address-of), * (pointer), . (member), -> (pointer member)

Constant

  • C supports several types of constants:
  • Integer constants:
    • Decimal: 123, -456
    • Octal: 0777
    • Hexadecimal: 0xFF, 0xA5
  • Floating-point constants: 3.14, 2.0e-4
  • Character constants: 'A', '\n', '\0'
  • Enumeration constants
  • Defined constants (#define MAX 100)
  • const qualified variables

String Literal

  • In C, string literals are arrays of characters terminated by null character (\0)
  • Enclosed in double quotes: "Hello, World!"
  • C string escape sequences:
  • \n - newline
  • \t - tab
  • \r - carriage return
  • \b - backspace
  • \\ - backslash
  • \" - double quote
  • \0 - null character
  • String concatenation: "Hello " "World" is same as "Hello World"

Delimiter

  • Special symbols in C:
  • Semicolon (;) - statement terminator
  • Parentheses (()) - function calls, expressions, parameter lists
  • Braces ({}) - compound statements, function bodies, structure/union declarations
  • Square brackets ([]) - array subscripts
  • Comma (,) - separator in function arguments and variable declarations
  • Preprocessor symbols (#) - for preprocessor directives
  • Comments:
    • Single line: //
    • Multi-line: /* */