for any real or complex numbers x and y, and any non-negative integer n. The binomial coefficient appearing in (1) may be defined in terms of the factorial function n!:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$
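As a quick check of the factorial definition, the following Python sketch computes the coefficient as n!/(k!(n − k)!) and compares it with the standard library's `math.comb`. The function name `binomial` is an illustrative choice, not something taken from the text.

```python
from math import comb, factorial


def binomial(n: int, k: int) -> int:
    """Binomial coefficient n! / (k! (n - k)!) for integers 0 <= k <= n."""
    if not 0 <= k <= n:
        raise ValueError("expected 0 <= k <= n")
    # The quotient is always an integer, so floor division is exact here.
    return factorial(n) // (factorial(k) * factorial(n - k))


# Compare against the standard library for small n.
for n in range(8):
    assert all(binomial(n, k) == comb(n, k) for k in range(n + 1))

print([binomial(5, k) for k in range(6)])  # [1, 5, 10, 10, 5, 1]
```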

Examples
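Taking n to be 2, 3, 4, or 5 in the binomial theorem yields

$$\begin{aligned}
(x+y)^2 &= x^2 + 2xy + y^2, \\
(x+y)^3 &= x^3 + 3x^2y + 3xy^2 + y^3, \\
(x+y)^4 &= x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4, \\
(x+y)^5 &= x^5 + 5x^4y + 10x^3y^2 + 10x^2y^3 + 5xy^4 + y^5.
\end{aligned}$$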

Combinatorial proof

Example
The coefficient of $xy^2$ in

$$(x+y)^3 = (x+y)(x+y)(x+y) = xxx + xxy + xyx + xyy + yxx + yxy + yyx + yyy = x^3 + 3x^2y + 3xy^2 + y^3$$

equals $\binom{3}{2} = 3$ because there are three x,y strings of length 3 with exactly two y's, namely,

$$xyy, \quad yxy, \quad yyx,$$

corresponding to the three 2-element subsets of {1, 2, 3}, namely,

$$\{2,3\}, \quad \{1,3\}, \quad \{1,2\},$$

where each subset specifies the positions of the y in a corresponding string.

General case
Expanding $(x+y)^n$ yields the sum of the $2^n$ products of the form

$$e_1 e_2 \cdots e_n,$$

where each $e_i$ is x or y. Rearranging factors shows that each product equals $x^{n-k}y^k$ for some k between 0 and n. For a given k, the following are proved equal in succession:

- the number of copies of $x^{n-k}y^k$ in the expansion
- the number of n-character x,y strings having y in exactly k positions
- the number of k-element subsets of {1, 2, ..., n}
- $\binom{n}{k}$ (this is either by definition, or by a short combinatorial argument if one is defining $\binom{n}{k}$ as $\frac{n!}{k!\,(n-k)!}$).

This proves the binomial theorem.
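The counting argument can be checked by brute force for small n. The sketch below enumerates every x,y string of length n with `itertools.product` and counts those containing exactly k letters y; the helper name `count_strings` is hypothetical and chosen only for this illustration.

```python
from itertools import product
from math import comb


def count_strings(n: int, k: int) -> int:
    """Count length-n strings over {'x', 'y'} with exactly k letters 'y'."""
    return sum(1 for s in product("xy", repeat=n) if s.count("y") == k)


# The strings behind the coefficient of x y^2 in (x + y)^3.
print(["".join(s) for s in product("xy", repeat=3) if s.count("y") == 2])
# ['xyy', 'yxy', 'yyx']

# The count agrees with the binomial coefficient for every small case tried.
for n in range(7):
    assert all(count_strings(n, k) == comb(n, k) for k in range(n + 1))
```

The enumeration visits all 2^n strings, so it is only practical for small n; it is meant as a check of the argument, not as a way to compute coefficients.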

Inductive proof
Another way to prove the binomial theorem (1) is with mathematical induction.

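Throughout this proof, a and b play the roles of x and y in (1). When n = 0, we have

$$(a+b)^0 = 1 = \sum_{k=0}^{0} \binom{0}{k} a^{0-k} b^k.$$

For the inductive step, assume the theorem holds when the exponent is m. Then for n = m + 1

$$\begin{aligned}
(a+b)^{m+1} &= a(a+b)^m + b(a+b)^m \\
&= a\sum_{k=0}^{m}\binom{m}{k}a^{m-k}b^k + b\sum_{j=0}^{m}\binom{m}{j}a^{m-j}b^j && \text{by the inductive hypothesis} \\
&= \sum_{k=0}^{m}\binom{m}{k}a^{m+1-k}b^k + \sum_{j=0}^{m}\binom{m}{j}a^{m-j}b^{j+1} && \text{by multiplying through by } a \text{ and } b \\
&= a^{m+1} + \sum_{k=1}^{m}\binom{m}{k}a^{m+1-k}b^k + \sum_{j=0}^{m}\binom{m}{j}a^{m-j}b^{j+1} && \text{by pulling out the } k = 0 \text{ term} \\
&= a^{m+1} + \sum_{k=1}^{m}\binom{m}{k}a^{m+1-k}b^k + \sum_{k=1}^{m+1}\binom{m}{k-1}a^{m+1-k}b^k && \text{by letting } j = k - 1 \\
&= a^{m+1} + \sum_{k=1}^{m}\binom{m}{k}a^{m+1-k}b^k + \sum_{k=1}^{m}\binom{m}{k-1}a^{m+1-k}b^k + b^{m+1} && \text{by pulling out the } k = m + 1 \text{ term from the right-hand sum} \\
&= a^{m+1} + b^{m+1} + \sum_{k=1}^{m}\left[\binom{m}{k} + \binom{m}{k-1}\right]a^{m+1-k}b^k && \text{by combining the sums} \\
&= a^{m+1} + b^{m+1} + \sum_{k=1}^{m}\binom{m+1}{k}a^{m+1-k}b^k && \text{from Pascal's rule} \\
&= \sum_{k=0}^{m+1}\binom{m+1}{k}a^{m+1-k}b^k && \text{by adding in the 0 and } m + 1 \text{ terms.}
\end{aligned}$$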
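The inductive step translates directly into a procedure for building the coefficients one exponent at a time. The following Python sketch applies Pascal's rule to pass from the coefficients for exponent m to those for m + 1 and checks each row against `math.comb`; the function name `next_row` is an assumption made for this example.

```python
from math import comb


def next_row(row: list[int]) -> list[int]:
    """Coefficients of (a + b)^(m+1) from those of (a + b)^m via Pascal's rule."""
    # Interior entries are C(m, k) + C(m, k - 1); the two end entries are 1.
    return [1] + [row[k] + row[k - 1] for k in range(1, len(row))] + [1]


row = [1]  # base case: (a + b)^0 = 1
for m in range(6):
    # Invariant matching the inductive hypothesis: row holds C(m, 0..m).
    assert row == [comb(m, k) for k in range(m + 1)]
    row = next_row(row)

print(row)  # [1, 6, 15, 20, 15, 6, 1]
```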

Newton's generalized binomial theorem
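Around 1665, Isaac Newton generalized the formula to allow exponents other than nonnegative integers. In this generalization, the finite sum is replaced by an infinite series. Namely, if x and y are real numbers with x > |y|,[1] and r is any complex number, then

$$(x+y)^r = \sum_{k=0}^{\infty} \binom{r}{k} x^{r-k} y^k = x^r + r x^{r-1} y + \frac{r(r-1)}{2!} x^{r-2} y^2 + \frac{r(r-1)(r-2)}{3!} x^{r-3} y^3 + \cdots. \qquad (2)$$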

When r is a nonnegative integer, the binomial coefficients for k > r are zero, so (2) specializes to (1), and there are at most r+1 nonzero terms. For other values of r, the series (2) has an infinite number of nonzero terms, at least if x and y are nonzero.
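A numerical sketch can make the series concrete. The code below assumes the standard generalized coefficient r(r − 1)⋯(r − k + 1)/k! and terms of the form (coefficient) · x^(r − k) · y^k, and it is restricted to real r for simplicity even though the statement allows complex r. The names `gen_binom` and `binomial_series` are hypothetical.

```python
def gen_binom(r: float, k: int) -> float:
    """Generalized coefficient r (r - 1) ... (r - k + 1) / k!."""
    c = 1.0
    for i in range(k):
        c *= (r - i) / (i + 1)
    return c


def binomial_series(x: float, y: float, r: float, terms: int) -> float:
    """Partial sum of the series with the given number of terms; assumes x > |y|."""
    return sum(gen_binom(r, k) * x ** (r - k) * y**k for k in range(terms))


# Non-integer exponent: the partial sums approach (x + y)^r.
x, y, r = 2.0, 0.5, 0.5
assert abs(binomial_series(x, y, r, terms=40) - (x + y) ** r) < 1e-12

# Nonnegative integer exponent: coefficients vanish for k > r.
assert gen_binom(3, 4) == 0.0
assert abs(binomial_series(x, y, 3, terms=4) - (x + y) ** 3) < 1e-12
```

The smaller |y|/x is, the fewer terms are needed for a given accuracy; here the ratio is 0.25, so 40 terms are far more than enough.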

The coefficients can also be written

$$\binom{r}{k} = \frac{1}{k!}\prod_{n=0}^{k-1}(r-n) = \frac{r(r-1)(r-2)\cdots(r-k+1)}{k!} = \frac{(r)_k}{k!},$$

where $(r)_k$ is the Pochhammer symbol, used here for the falling factorial. This form is useful when working with infinite series that are to be expressed in terms of generalized hypergeometric functions. It is used in applied mathematics, for example, when evaluating the formulas that model the statistical properties of the phase-front curvature of a light wave as it propagates through optical atmospheric turbulence.[citation needed]

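Taking r = −s (with 1 and −x in place of x and y) leads to a particularly handy but non-obvious formula, valid for |x| < 1:

$$\frac{1}{(1-x)^s} = \sum_{k=0}^{\infty} \binom{s+k-1}{k} x^k.$$

Further specializing to s = 1 yields the geometric series formula

$$\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k.$$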
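The case r = −s can also be checked numerically. The sketch below assumes the formula in question is the standard expansion of 1/(1 − x)^s as the sum over k of C(s + k − 1, k) x^k for |x| < 1, and it restricts s to positive integers so that `math.comb` applies. The function name `neg_binomial_series` is illustrative only.

```python
from math import comb


def neg_binomial_series(x: float, s: int, terms: int) -> float:
    """Partial sum of sum_k C(s + k - 1, k) x^k; assumes |x| < 1 and integer s >= 1."""
    return sum(comb(s + k - 1, k) * x**k for k in range(terms))


x = 0.3
for s in (1, 2, 5):
    assert abs(neg_binomial_series(x, s, 200) - 1 / (1 - x) ** s) < 1e-9

# With s = 1 every coefficient is C(k, k) = 1, which gives the geometric series.
assert all(comb(1 + k - 1, k) == 1 for k in range(10))
```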

Generalizations
Formula (2) can be generalized to the case where x and y are complex numbers. For this version, one should assume |x| > |y|[1] and define the powers of x + y and x using a holomorphic branch of log defined on an open disk of radius |x| centered at x.

Formula (2) is valid also for elements x and y of a Banach algebra as long as xy = yx, x is invertible, and ||y/x|| < 1.

For a more extensive account of Newton's generalized binomial theorem, see binomial series.

The binomial theorem in abstract algebra
Formula (1) is valid more generally for any elements x and y of a semiring satisfying xy = yx. The theorem is true even more generally: alternativity suffices in place of associativity.
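The role of the condition xy = yx can be illustrated with 2×2 integer matrices, which form a ring and hence a semiring. The following Python sketch, using hypothetical helper names and example matrices chosen only for this illustration, checks formula (1) for a commuting pair and shows that it fails for a non-commuting pair.

```python
from math import comb

Matrix = list[list[int]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square integer matrices of the same size."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a: Matrix, b: Matrix) -> Matrix:
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_pow(a: Matrix, e: int) -> Matrix:
    """a raised to the non-negative integer power e (identity when e = 0)."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, a)
    return result


def binomial_sum(x: Matrix, y: Matrix, n: int) -> Matrix:
    """Right-hand side of (1): sum over k of C(n, k) x^(n-k) y^k."""
    size = len(x)
    total = [[0] * size for _ in range(size)]
    for k in range(n + 1):
        term = mat_mul(mat_pow(x, n - k), mat_pow(y, k))
        scaled = [[comb(n, k) * v for v in row] for row in term]
        total = mat_add(total, scaled)
    return total


# A commuting pair: the binomial formula holds for every exponent tried.
x = [[1, 1], [0, 1]]
y = [[2, 3], [0, 2]]
assert mat_mul(x, y) == mat_mul(y, x)
for n in range(6):
    assert mat_pow(mat_add(x, y), n) == binomial_sum(x, y, n)

# A non-commuting pair: the formula already fails at n = 2.
p = [[0, 1], [0, 0]]
q = [[0, 0], [1, 0]]
assert mat_mul(p, q) != mat_mul(q, p)
assert mat_pow(mat_add(p, q), 2) != binomial_sum(p, q, 2)
```

For the non-commuting pair, (p + q)^2 expands to p^2 + pq + qp + q^2, and the two middle terms cannot be merged into 2pq, which is where the formula breaks down.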

The binomial theorem can be stated by saying that the polynomial sequence {1, x, x^2, x^3, ...} is of binomial type.

History
This formula and the triangular arrangement of the binomial coefficients are often attributed to Blaise Pascal, who described them in the 17th century, but they were known to many mathematicians who preceded him. The 4th-century B.C. Greek mathematician Euclid knew a special case of the binomial theorem up to the second order,[2][3] as did the 3rd-century B.C. Indian mathematician Pingala to higher orders. A more general binomial theorem and the so-called "Pascal's triangle" were known to the 10th-century A.D. Indian mathematician Halayudha, the 11th-century A.D. Persian mathematician Omar Khayyám, and the 13th-century Chinese mathematician Yang Hui, who all derived similar results.[4]