Barcodes are widely used around the world in many contexts and applications. The bars are machine-readable and typically represent a numeric or alphanumeric code.
The retail point-of-sale barcode design we know today, using vertical bars to represent a long number, was developed by George Laurer in the early 1970s, initially for the North American grocery industry but quickly expanding to all of retail. In the mid-1970s, he revised his invention to support one more digit so that it would be adopted in Europe.
The standards for these barcodes were managed by separate regional organizations at first—in North America by the Uniform Code Council, and in Europe by the European Article Number Commission—but they coordinated their activity in the 1990s, and fully consolidated in 2005 as the GS1 organization. GS1 sets the global standard for these barcodes and refers to them by the combined name UPC/EAN.
The way the bars are formatted and how they represent digits is called the barcode symbology. There are two main barcode symbologies for consumer retail merchandise: UPC-A and EAN-13.
UPC-A is a forward-compatible subset of EAN-13. Any UPC-A barcode can be read by an EAN-13 scanner, but only certain EAN-13 barcodes can be read by a UPC-A scanner. Since the early 2000s, barcode scanners have been more universal and programmable, able to read any UPC/EAN barcode and interpret it as however many digits are needed.
A human-readable version is an optional part of each symbology, so that the code can be manually entered when the bars can't be scanned properly.
UPC-A remains the dominant form of barcode in the US & Canada. EAN-13 is dominant everywhere else. However, both types of barcodes can be found on products worldwide; the type of symbology is not a foolproof indicator of an item's market region.
Barcodes started appearing on retail LP sleeves in the late 1970s, but adoption was slow in the record industry; it wasn't until circa 1986 that major-label releases consistently had barcodes, and there were still exceptions after that.
The actual standards are complex, but the digits represented by EAN/UPC can be thought of in this simplified way:
For the main 12 digits (the ones after the optional region code), each digit is directly represented by two bars and two spaces of variable width. The choice of bar/space patterns used for encoding the first six of these digits depends on the region code, which is only implied, and which for a valid UPC-A must be 0. Thus, when the region code is 0, the bar/space patterns of UPC-A and EAN-13 are identical.
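The way the implied region code selects the patterns for the first six bar-encoded digits can be sketched in Python. The table is the standard EAN-13 assignment, in which the two available pattern sets are conventionally called "L" and "G". The names `PARITY`, `left_half_pattern`, and `implied_region_code` are illustrative only and do not come from this guide or from any scanner software.

```python
# For each implied region code (the leading EAN-13 digit), the pattern set
# ("L" or "G") used for each of the first six bar-encoded digits.
PARITY = {
    "0": "LLLLLL",  # all "L": the only arrangement allowed in UPC-A
    "1": "LLGLGG",
    "2": "LLGGLG",
    "3": "LLGGGL",
    "4": "LGLLGG",
    "5": "LGGLLG",
    "6": "LGGGLL",
    "7": "LGLGLG",
    "8": "LGLGGL",
    "9": "LGGLGL",
}


def left_half_pattern(region_code: str) -> str:
    """Return the L/G arrangement used when the region code is `region_code`."""
    return PARITY[region_code]


def implied_region_code(pattern: str) -> str:
    """Recover the region code a scanner would infer from an L/G arrangement."""
    for digit, arrangement in PARITY.items():
        if arrangement == pattern:
            return digit
    raise ValueError(f"not a valid EAN-13 arrangement: {pattern!r}")


print(left_half_pattern("0"))         # LLLLLL
print(implied_region_code("LGLLGG"))  # 4
```

Because region code 0 maps to the all-"L" arrangement, a barcode whose first six digits all use "L" patterns looks the same to a scanner whether it was printed as UPC-A or as EAN-13.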
A pair of extra-long guard bars always appears in the middle and on each end of the barcode.
A human-readable interpretation (HRI) of the barcode is often printed below the bars, sometimes with the first and/or last digit set apart to one side.
(As mentioned, this is a simplified interpretation. In reality, the company prefix is of variable length, and the 11 digits shared by the company prefix and article number are divided flexibly: a company assigned a longer prefix has fewer digits left for unique article numbers.)
UPC-A and EAN-13 differ in what bar patterns are allowed for the first six digits of the encoded string:
UPC-A and EAN-13 differ in the length of the bars:
UPC-A and EAN-13 differ in how the HRI text is presented:
Japan has its own version of EAN-13 called JAN (Japanese Article Number). The bars are the same as EAN-13, and thus encode 13 digits like normal. However, on older items, the HRI may be printed in a machine-readable manner compatible with legacy OCR (optical character recognition) devices unique to Japan. When the HRI is machine-readable, the font is special, the bars do not extend down in between any numbers, and the text is preceded by the letter T.
A barcode reader only reads across the pattern of black and white bars. It does not read the HRI, and it does not check the length of the bars. Therefore, the scanner does not know whether it is reading UPC-A or EAN-13, unless it finds that the first six digits in the bars use one of the patterns that implies a non-zero region code. In other words, when the region code is zero, a barcode reader cannot be trusted to correctly identify the type of barcode; it could be UPC-A or EAN-13.
Any EAN scanner can read a UPC-A barcode, but an old UPC-only scanner cannot read an EAN-13 barcode at all unless the region code is zero.
Scanners manufactured after 2004 can read any UPC/EAN, and will internally interpret the bars as a 13-digit string. However, if the first digit is zero, then depending on how the reader is configured, it may report the string as an EAN with all 13 digits, or for compatibility with old UPC systems, it may drop the initial zero and report the string as a 12-digit UPC.
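The reporting behavior described above can be sketched in Python. This is a minimal model, not the firmware of any real scanner; the function name `report_scan` and the `upc_compat` flag are hypothetical stand-ins for whatever configuration setting a given reader offers.

```python
def report_scan(decoded13: str, upc_compat: bool = False) -> str:
    """Model how a reader might report a barcode it decoded as 13 digits.

    `upc_compat` is a hypothetical setting: when enabled, a leading zero is
    dropped so that old 12-digit UPC systems receive the string they expect.
    """
    if len(decoded13) != 13 or not (decoded13.isascii() and decoded13.isdigit()):
        raise ValueError("expected exactly 13 digits")
    if upc_compat and decoded13[0] == "0":
        return decoded13[1:]  # report as a 12-digit UPC
    return decoded13          # report as a 13-digit EAN


print(report_scan("0077779431021"))                   # 0077779431021
print(report_scan("0077779431021", upc_compat=True))  # 077779431021
```

A string whose first digit is not zero is returned unchanged in either mode, since it has no 12-digit UPC form.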
Free barcode-reading apps for smartphones and tablets often just report everything as an EAN.
Here is the back cover of a UK-market CD showing an EAN-13 barcode:
Here is the back cover of the corresponding US-market CD showing a UPC-A barcode:
The bar patterns, as seen by a scanner, are exactly the same. Regardless of which item is scanned, an EAN scanner will report 13 digits (0077779431021), and a UPC scanner (or a modern scanner programmed for UPC compatibility) will report 12 digits (077779431021).
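One way to check that two scanner reports describe the same bars is to pad any 12-digit report with a leading zero before comparing. The sketch below assumes reports arrive as plain digit strings, possibly with spaces; `to_13_digits` is an illustrative helper name, not part of any Discogs or scanner tooling.

```python
def to_13_digits(code: str) -> str:
    """Normalize a 12- or 13-digit report to 13 digits for comparison."""
    digits = code.replace(" ", "")
    if not (digits.isascii() and digits.isdigit()) or len(digits) not in (12, 13):
        raise ValueError("expected 12 or 13 digits")
    return digits.zfill(13)  # a 12-digit string gains one leading zero


uk_report = to_13_digits("0077779431021")  # as reported by an EAN scanner
us_report = to_13_digits("077779431021")   # as reported by a UPC scanner
print(uk_report == us_report)              # True
```

The comparison says nothing about which type of barcode is printed on either item; it only shows that the encoded digits match.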
It is factual to say:
The GS1 standard dictates that a UPC-A barcode may be decoded as a 13-digit number by adding an implied leading zero to the GTIN-12. Because of this, and because of the way scanners and vendor databases operate, it is arguably valid to say:
This overlap and ambiguity in the terminology make it difficult to describe Barcode fields on Discogs precisely. A description of "EAN", for example, may refer to the type of barcode image, or it may refer to a given scanner's numeric interpretation of the bars. It would be confusing to refer to an "EAN" on a release which has a UPC-A type of barcode, even though the bars can be interpreted as a 13-digit EAN string. Therefore, although entering the 13-digit string is allowed, most users do not mention the type of barcode at all unless both types are printed on the same release (a very uncommon situation pictured below). If you do wish to mention the type of barcode, do not rely on the barcode reader alone; use your eyes to confirm the actual type of barcode.
Here is a sample template, plus a real-world example, of how such data can be captured based purely on the HRI:
( ) Blank space
(x) Numbers from 0-9
UPC-A : x xxxxx xxxxx x
EAN-13 : x xxxxxx xxxxxx
UPC : xxxxxxxxxxxx (12-digit no spacing)
GTIN-13 : xxxxxxxxxxxxx (13-digit no spacing)

UPC-A : 6 02498 83994 2
EAN-13 : 0 602498 839942
UPC : 602498839942
GTIN-13 : 0602498839942
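The template above can be reproduced mechanically from a 12-digit string. The following Python sketch groups the digits according to the spacing shown in the template; `barcode_field_variants` is a hypothetical helper name, and the labels simply mirror the ones used in the template.

```python
def barcode_field_variants(upc12: str) -> dict[str, str]:
    """Build the four renderings from the template for a 12-digit UPC string."""
    if len(upc12) != 12 or not (upc12.isascii() and upc12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    gtin13 = "0" + upc12  # implied leading zero
    return {
        # x xxxxx xxxxx x
        "UPC-A": f"{upc12[0]} {upc12[1:6]} {upc12[6:11]} {upc12[11]}",
        # x xxxxxx xxxxxx
        "EAN-13": f"{gtin13[0]} {gtin13[1:7]} {gtin13[7:13]}",
        "UPC": upc12,
        "GTIN-13": gtin13,
    }


for label, text in barcode_field_variants("602498839942").items():
    print(f"{label} : {text}")
```

Note that the spaced forms should only be entered when they match what is actually printed on the release; the helper shows how the groupings relate, not which one applies to a given item.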
Here is an unusual example showing the back cover of a CD marketed in the US and Germany with different barcode types (and entirely different digit strings) for each region:
Sometimes, a UPC/EAN barcode is followed by an "add-on symbol": a secondary, supplemental barcode which encodes only two digits (or sometimes five), with the HRI printed above instead of below the bars. For example, the short bars on the right side of this image, with "97" above them, constitute a 2-digit add-on to the UPC-A barcode on the left:
These extra digits are for things like magazine issue or product version numbers. For example, CBS Records in the US & Canada sometimes used add-ons, presumably for reissue numbering: the first release on a particular format might have no add-on, the first reissue might have an 02, the second reissue an 03, and so on.
The add-on symbol is part of the barcode, but is not the primary identifier of the item. Many scanners ignore the add-on and only report what is in the main barcode.
UPC/EAN inventor George J. Laurer commented on his blog that, from a technical standpoint, UPC-A was "always" meant to represent a 13-digit code. He made it capable of 12 digits, circa 1973, and expanded it in a clever way to 13 digits several years later, when he devised the EAN-13 encoding. He said that for political reasons, GS1 and its predecessors never accepted this point of view, instead promoting the idea that UPC-A contains only 10, 11, or 12 digits. To this day, GS1 requires that UPC-A only be used for 12-digit codes, and forbids UPC-A from using the bar patterns for non-zero region codes. Laurer believed this arbitrary restriction would eventually be lifted.