In number theory and algebra, the notation and concepts are sometimes developed as follows:
The first concept is divisibility. a | b is a proposition, representing that the integer a divides the integer b. In C you would usually write this test as (a % b == 0), but for purposes of math, you'd define an study | before talking about modulo probably. For instance, you can see that | defines a partial order on the integers. So, if a | b and b | c, then a | c. Also a | a, and if both a | b and b | a, then a = b.
The second concept is congruence modulo a number. In math it's usually notated:
a = b mod c
except instead of using =, you would use 3 lines, for instance, unicode 2261.
Here "mod" is not a binary operator being applied to b. It's rather a qualifier which modifies the meaning of =, you should [parse this basically as a special ternary operator. However, usually c is not changing. It might have been better if we notated thins like this
a =_c b
where c is a subscript modfying the equals symbol.
The meaning of this notation is, c | (a - b)
that is, that c divides the difference of a and b.
For any number a, the "congruence class " of a modulo c is the set of all b which is it is congruent to. Congruence is an equivalence relation: It's always the case that a = a mod c, because 0 is divisible by every number. Also, if a = b mod c then b = a mod c, and if a = b mod d and b = c mod d, then it's easy to see a = c mod d.
So, for any fixed number c, the set of all integers is partitioned into a collection of disjoint sets, the congruence classes modulo c. There are exactly c such classes, corresponding to the integers between 0 and c. Every other number is congruent to one of those -- you can figure out which one by taking the remainder after dividing by c, which is the conventional definition of "modulo".
One of the basic ideas of algebra is that just as we can add and multiply two numbers, it turns out that we can add and multiply two congruence classes, and the result will be unambiguously a congruence class. This is a bit counterintuitive at first. To do it, we simply take any element of the first congruence class, and any element of the second congruence class, add or multiply them as required, then compute the congruence class of the result. Naively, different choices could lead to different outcomes, so this wouldn't be well-defined, but it turns out that it always produces the same answer.
Let's introduce some new notation: Say that a mod c is a binary operation which takes two integers, and produces a set of integers, the congruence class of a mod c. This map is well-studied in group theory -- it is the canonical map from the integers Z to the quotient group Z / cZ. (The quotient group Z / cZ is the group whose elements are the congruence classes of Z mod c.)
If we do it that way, then we can reinterpret the standard notation "a = b mod c" as a succinct way of writing "a mod c = b mod c", that is, a and b are congruent if they correspond to the same congruence class.
Then, the most important properties of congruence and modulo can be written as follows:
(a + b) mod c = (a mod c) + (b mod c)
(a * b) mod c = (a mod c) * (b mod c)
These identities are extremely important, they enable computational shortcuts when you only care about the answer modulo some number. For instance, suppose I have some polynomial x^2 + 100x - 57, and I want to evaluate it mod 5 at some value x = 1701. Because of these identities, I can always reduce the input and any intermediate result mod 5 eagerly throughout the computation, without changing the final result. So
(1701)^2 + 100*1701 - 57 mod 5
= ((1701 mod 5)^2 mod 5) + (100 * 1701 mod 5) - 57 mod 5
= 1 + 0 - 2 mod 5 = - 1 mod 5 = 4 mod 5
This basic idea underlies important algorithms like fast exponentiation by repeated squaring. It's used essentially in Diffie-Hellman key exchange for instance, and it has numerous other applications in computer science.
In group theory and algebra, the quotient group construction is greatly generalized, and applies to lots of other situations. "Properly", algebraists think of "modulo" as yielding a congruence class, not a particular number.
In some arguments it is necessary to find "congruence class representatives", so you might say, the numbers 0, ..., c-1 are a set of representatives for the congruence classes mod c, and you might want a map that maps a number to its unique congruence class representative.
The modulo operation which appears in programming languages is an example of such a map, it's certianly useful. It produces for each number the remainder after division, which is in particular a "simpler" member of the congruence class.