Gloucestershire Local History banner

Simple Image Scanning for Local Historians

Guidelines No. 7         Issue 1.1         February 2005


These guidelines have been prepared in response to a request from the Local History Association Computer Group for very basic advice on scanning photographs, negatives, slides and documents. There are many excellent web sites that offer advice on this subject but all too frequently the sites omit the detailed advice that a newcomer needs. To be told "ensure that you use an appropriate scanning resolution" is not very helpful when you are starting out and may be unclear what "an appropriate scanning resolution" might be. This article therefore includes some very specific suggestions about matters such as scanning resolution and image file format. However, it must be stressed that they are only given so that local historians wishing to make use of digital images in their projects can get started. It is very important to experiment with different parameters as more experience is gained.

The advice is given in the form of answers to the following questions:-


1. What can be done with digital images?
2. What equipment will I need as well as my computer?
3. Which image file formats should I use?
4. Which storage media should I use?
5. What software do I need to scan and edit images?
6. What settings should I use when scanning?
7. What changes can I make to my digital images?
8. Which are the most useful web sites for help with scanning matters?

1.       What can be done with digital images?

Digital images obtained using a flatbed scanner, a dedicated 35mm slide scanner or a digital camera can be put to a large number of uses. These include:-

  • Inclusion in articles published in paper form.
  • Inclusion in web pages or articles published on the Internet.
  • Storage on the hard disk of a computer or on a Compact Disc (CD) to form electronic picture albums.
  • Conversion of scanned text into word processor documents using Optical Character Recognition (OCR).

2.       What equipment will I need as well as my computer?

Note: The term samples per inch (s.p.i.) is a more correct term in some (but not all) instances for what is commonly referred to as dots per inch (d.p.i.).
  • A flatbed scanner for documents: Prices start at £40 and go up to about £200 for 'home' models. The minimum scanning resolution to purchase is about 1200 samples per inch. Many scanners are now supplied with transparency adapters which are capable of producing high quality images from 35mm slides and negatives. At the end of 2004 flatbed scanners with a resolution of 4800 samples per inch were available at a cost of about £300.
  • A dedicated film scanner for 35mm slides or negatives. These cost between £200 and £500 for 'home' models. They typically have scanning resolutions in the range 1800 to 2700 samples per inch.
  • A digital camera. The price of these is now falling rapidly. A good choice would be a 6 or 8 megapixel (million pixel) camera.

It will be almost essential to have a CD writer so that you can store your images away from your computer (see Question 4). These are now less than £50 for a model which is fitted inside your computer. If you wish to add a CD writer to an existing computer system and you wish to avoid opening up the computer to fit an internal model it is possible to buy an external unit. These are usually connected via a Universal Serial Bus (or USB) connector which most modern computers have. An external CD-writer requires its own case and power supply and so tends to cost £30 to £50 more than similar internal devices. See Guidelines No 1 and No 2 in this series for general advice about computers and software.

3.       Which image file formats should I use?

There are literally dozens of image file formats but, so that as many people as possible can read your images, it is important to stick to what has become the standard format for 'master copies' of image files: the TIF (or TIFF, Tagged Image File Format) format. There are a number of variants of the TIF format and so it is best to store images without using any file compression. Although this results in larger file sizes compared to using compressed files it is safer for archive purposes as virtually all image editing programs (see Question 5) can read uncompressed TIF files. The format is a 'lossless' one and so, unlike so-called 'lossy' formats, no information is lost when the image is stored. Lossy formats such as JPG (developed by the ISO Joint Photographic Experts Group) are extremely useful for displaying images on web sites as they can be compressed into very small file sizes, and JPG copies can be readily produced from the TIF master files.

PNG (Portable Network Graphics) is a newer standard for images used on the Internet but it has so far gained little ground against the JPG format.

An excellent discussion on image file formats is available at www.rwsh.com.

To Summarise: It is suggested that uncompressed TIF files are used for master images but JPG and other formats can be used to distribute and display versions of the images.
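The difference between 'lossless' and 'lossy' storage can be illustrated with a few lines of Python. This is only a sketch: the pixel values are made up, the lossless step uses the 'Deflate' method (the one PNG uses) from Python's standard zlib module, and the 'lossy' step is a deliberately crude rounding of grey levels that stands in for the idea of discarding detail. It is not how JPG actually works.

```python
import zlib

# A made-up row of greyscale pixel values (0-255), for illustration only.
pixels = bytes([12, 13, 15, 200, 201, 203, 90, 91, 94, 255])

# Lossless: compress, decompress, and compare with the original.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)
lossless_ok = restored == pixels

# Lossy (very crude stand-in): round every value down to a multiple of 16.
# The original values cannot be recovered from the rounded ones.
coarse = bytes((p // 16) * 16 for p in pixels)
lossy_ok = coarse == pixels

print("lossless round trip identical:", lossless_ok)
print("lossy version identical:", lossy_ok)
```

The lossless round trip returns exactly the bytes that went in, whereas the rounded version no longer matches the original. This is why the master copy is best kept in a lossless form and lossy copies made from it, never the other way round.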

4.       Which storage media should I use?

Recordable Compact Discs (CDs) offer a very good performance to cost ratio. Each CD can store about 650MB of data. Thus about 100 images of the kind produced in the GSIA slide scanning project referred to under Question 6 can be stored on a single CD. Andy McFadden's CD FAQ web site (http://www.cdrfaq.org) gives a vast amount of technical information about CDs.

The terms CD-ROM, CD-R and CD-RW are encountered frequently so what is the difference between them? In fact all three look physically very similar but they differ considerably in respect of how the information is stored. A CD-ROM is a CD produced by pressing a master disc against a blank CD, similar to how a vinyl audio record is produced. A CD-R is written in a CD writer using a laser beam and once an area on the disc has been written that area cannot have its data overwritten. CD-RW refers to re-writable CDs which are also written using a laser beam but, unlike the CD-R, they can be erased and rewritten a large number of times. Nearly all modern CD writers are what is known as CD rewriters and can create both CD-R and CD-RW discs.

No one really knows the lifetime of CDs and estimates range from less than a year to 100 years! It makes great sense to ensure that multiple copies are made of important data and that the CDs are stored at separate locations. It is also a good idea to use good quality 'branded' CDs and to use two or more different manufacturers for the multiple copies.

A development of the CD is the DVD, which looks physically identical to a CD. Its capacity is up to 14 times that of a CD. The price of DVD writers has fallen significantly in the last year and DVDs are likely to replace CDs in the relatively near future.

In comparison, a floppy disk can only store about 1.4MB (1.4 million bytes) of data and so its use is very limited for storing images.
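The capacity figures above can be turned into a rough count of images per disc. The short Python sketch below uses the 650MB and 1.4MB capacities quoted in this section and the working figure of about 6MB per image from Question 6. The function name is simply chosen for this illustration, and no allowance is made for the space a disc uses for its own housekeeping.

```python
def images_per_disc(capacity_mb, image_mb):
    """Whole number of images of a given size that fit on one disc."""
    return int(capacity_mb // image_mb)

CD_MB = 650       # CD capacity quoted above
FLOPPY_MB = 1.4   # floppy disk capacity quoted above
IMAGE_MB = 6      # working size per slide image (see Question 6)

cd_count = images_per_disc(CD_MB, IMAGE_MB)
floppy_count = images_per_disc(FLOPPY_MB, IMAGE_MB)

print("Images per CD:", cd_count)
print("Images per floppy disk:", floppy_count)
```

The result is a little over 100 images for a CD, in line with the figure given above, and none at all for a floppy disk.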

5.       What software do I need to scan and edit images?

Nearly all scanners and digital cameras come with basic software to enable you to scan and load the images onto your computer and most of them provide software to edit the images (for example to resize, alter the colour characteristics and crop). Excellent scanning software that works for many flatbed and film scanners has been produced by Ed Hamrick in the USA. This product is called Vuescan and is now used by the Author for most of his scanning projects. It costs $40 (about £30). A demonstration version can be downloaded free of charge; it places a 'watermark' in the images until a licence has been purchased.

Adobe Photoshop (Version CS) is still the industry standard for editing images but the full product costs more than £450 (inc. VAT). However, Adobe has released a cut-down version known as Adobe Photoshop Elements (Version 3). This has all the features most users require and is an excellent choice. It costs about £55 (inc. VAT) and is widely available. At about £80 (inc. VAT) Paint Shop Pro (Version 9) is a very popular and powerful program.

The Gimp (GNU Image Manipulation Program, Version 2.0) is a very powerful free program which the developers claim is almost as powerful as the full Photoshop program. Xnview is a popular freeware image editing program that is worth considering.

The Irfanview image viewing software is freeware and is an extremely powerful and versatile program. It will rapidly create 'thumbnails' (small images) for all the images in a directory. Other features include batch processing where the same edit commands are automatically carried out for a number of files and the ability to create slide shows.

6.       What settings should I use when scanning?

Most scanning software allows you to control the scanning resolution (the number of samples of the image per inch captured in each direction) and the image mode. Not surprisingly your choice depends on the nature of the original and what use will be made of the digital version of the image. It is useful to start by considering when to use the three main image modes (monochrome, greyscale and colour) and the corresponding file sizes.

  • Line art, that is simple drawings or text, should be scanned in monochrome mode (sometimes called bi-tonal or black and white mode). The information from eight pixels (picture elements) of the image will be stored in each byte (each of eight 'bits of information') in the image file which results in very small file sizes compared to the two other modes.
  • Black and white photographs are made up of a large number of shades of grey and so greyscale (or grayscale) mode is used for these. In this mode each pixel is represented by one of 256 different levels of grey ranging from pure white through to pure black. The information from just one pixel will be stored in each byte and so greyscale images give 'medium' sized files.
  • Colour images are split into three or more 'layers' where each 'layer' is composed of a single colour and the most common arrangement is the so-called RGB mode where the image is broken up into Red, Green and Blue 'layers'. Three full bytes are now required to store the information from each pixel and it is easy to see why high resolution colour scans can produce extremely large files.
  • To summarise: Ensure you are using the correct scanning mode for monochrome, greyscale or colour originals. It is certainly possible to scan line art in 24 bit RGB mode but the file sizes will be 24 times larger than they need be!
  • File size considerations. Assume for a moment that each pixel you get from the scanning process will result in one pixel for output on devices such as a computer monitor (screen) or printer. Computer monitors found in the home currently employ screen resolutions of typically 800 x 600 or 1024 x 768 pixels. These correspond to about 0.5 and 0.8 million pixels and, for a colour image, to uncompressed file sizes of about 1.5 and 2.5MB, respectively. However, if good quality prints are required then an image resolution of between 100 and 150 pixels per inch is needed and on a full A4 sheet of paper (about 8.3 x 11.7 inches) this adds up to between about 3 and 6.5MB for colour.
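The relationship between image mode and file size described above can be checked with a short calculation. The Python sketch below assumes 1, 8 and 24 bits per pixel for monochrome, greyscale and RGB colour respectively, as set out in the list, and ignores the small header that a real image file also contains. The function and variable names are chosen just for this illustration.

```python
# Bits needed to store one pixel in each of the three main image modes.
BITS_PER_PIXEL = {"monochrome": 1, "greyscale": 8, "colour": 24}

def file_size_bytes(width_px, height_px, mode):
    """Approximate uncompressed image size in bytes (file header ignored)."""
    bits = width_px * height_px * BITS_PER_PIXEL[mode]
    return (bits + 7) // 8  # round up to a whole number of bytes

for width, height in [(800, 600), (1024, 768)]:
    for mode in BITS_PER_PIXEL:
        size_mb = file_size_bytes(width, height, mode) / 1_000_000
        print(f"{width} x {height} {mode}: {size_mb:.2f}MB")
```

For the same number of pixels, the colour file is always 24 times the size of the monochrome one and three times the size of the greyscale one.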

Some suggested 'starter' settings. Remember, it is very important to experiment with different parameters. If you wish to print enlargements of the original you will need to increase the minimum scanning resolution in proportion to the enlargement: for a print twice the size of the original, double the scanning resolution.

  • Text scanning for optical character recognition (OCR) or as page images for say Adobe Acrobat PDF files is normally carried out at 300 samples per inch.
  • Scanning of Line Art should in principle be carried out at 600 samples per inch if the output is to be printed full size on a 600 dots per inch laser printer. In many cases a 300 samples per inch scan will give results that are almost as good as a 600 samples per inch scan and the file will be only one quarter the size. Ink jet printers are often capable of higher resolution such as 1200 dots per inch and, in principle, a higher resolution scan could be made to match this. However, in many cases the difference in quality will not be noticeable but the file size will be four times larger.
  • Colour and Black and White pictures to be printed at the same size as the original need a minimum scan at about 150-200 samples per inch. If enlargements up to twice the size of the original are required the original should be scanned at a minimum of 300-450 samples per inch.
  • When scanning slides with a dedicated film scanner (for home or semi-professional use) use the maximum scanning resolution - typically 1800-2700 samples per inch. While this sounds very impressive it must be remembered that the area of a slide or negative is a mere 36mm x 24mm (about 1.5 x 1 inch). Thus a full scan at 2700 samples per inch will yield an image of about 10 million pixels which will occupy about 29MB for a colour image. However, acceptable quality prints of up to A3 size will be possible with such a file.
  • Image files are large files so inevitably the scanning resolution adopted will be a compromise between image quality and file size. If there are many slides to be scanned, as is the case with a project that the Gloucestershire Society for Industrial Archaeology (GSIA) is carrying out, then some compromise will be necessary. Many of these slides were taken more than 30 years ago and are beginning to deteriorate, and experience has shown that very acceptable results are obtained using half the maximum resolution (1350 samples per inch). This results in a working value of about 6MB per uncompressed TIFF image and means that about 100 images may be stored on a single 650MB CD-R. These files give high quality A4 colour prints. Obviously, when required, selected images can be scanned at the maximum resolution that the equipment permits.
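The arithmetic behind the slide scanning figures and the enlargement rule can be sketched in Python as follows. The function names are hypothetical, a full 36mm x 24mm frame is assumed with no cropping, and three bytes per pixel are assumed for colour. The A3 paper length (420mm) is a standard figure rather than one taken from these guidelines.

```python
MM_PER_INCH = 25.4

def scan_dimensions(width_mm, height_mm, spi):
    """Pixel dimensions of a scan made at `spi` samples per inch."""
    return (round(width_mm / MM_PER_INCH * spi),
            round(height_mm / MM_PER_INCH * spi))

def resolution_for_enlargement(same_size_spi, enlargement):
    """Scanning resolution needed to print `enlargement` times larger."""
    return same_size_spi * enlargement

# A full 35mm frame scanned at 2700 samples per inch.
width_px, height_px = scan_dimensions(36, 24, 2700)
pixels = width_px * height_px
colour_mb = pixels * 3 / 1_000_000  # three bytes per pixel in RGB mode

# Pixels per inch available if that scan is printed across an A3 sheet.
a3_long_side_inches = 420 / MM_PER_INCH
a3_ppi = width_px / a3_long_side_inches

print(f"{width_px} x {height_px} pixels, about {colour_mb:.0f}MB in colour")
print(f"About {a3_ppi:.0f} pixels per inch when printed at A3")
print("For a 2x enlargement of a 150 s.p.i. scan, use",
      resolution_for_enlargement(150, 2), "s.p.i.")
```

The scan comes out at a little under 10 million pixels and about 29MB, matching the figures quoted above, and still provides over 200 pixels per inch across an A3 print.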

Ideally you should record information concerning the caption, date, creator and the various settings within the 'header' of the image file. Only certain file formats allow this information, which is known as metadata, to be stored in this way. Any further discussion of metadata is outside the scope of the present article but there is a lot about metadata on the Internet.

7.       What changes can I make to my digital images?

It will depend on your software but most programs will enable you to:-

  • Reduce or enlarge the size of the whole of the image.
  • Crop the image (retain just a specified rectangular portion of the image).
  • Change the image resolution (the number of pixels per inch).
  • Alter brightness or contrast of the image.
  • Adjust the red, green and blue colours in the image independently.
  • Change colour images to greyscale images.
  • Create special effects using tools called 'filters'.
  • Add text to the image.
  • Add lines or patterns to the image.
  • Sharpen images.

The ability to sharpen images is very important as the scanning process invariably 'flattens' the image, that is, it reduces its 'sharpness'. Appropriate use of a sharpness filter (e.g. the 'unsharp mask' filter) can make the division between different coloured areas in an image appear to be more pronounced and thus seem sharper. It is a very valuable tool that is often overlooked by beginners. However, if too much sharpening is used horrible effects will result. Where the degree of sharpening can be controlled, sensible starting values of the three parameters that can be altered are amount = 150%, radius = 2 pixels and threshold = 3. A threshold of 3 means that the colour values of adjacent pixels must differ by more than 3 or the sharpening effect will not be applied to that pair of pixels. The radius is the number of pixels surrounding the edge pixels that are affected by the sharpening and the amount is how much the contrast of those pixels is increased. This is where it is particularly important to experiment.
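To show roughly what the three parameters do, here is a much simplified unsharp mask applied to a single row of grey values in Python. It is a sketch under stated assumptions, not the method used by any particular editing program: the blur is a plain average over the pixels within the radius (real programs normally use a smoother blur), the threshold is compared with the difference between each pixel and its blurred neighbourhood, and the function name is made up for this example.

```python
def unsharp_mask_row(row, amount=1.5, radius=2, threshold=3):
    """Sharpen one row of grey values (0-255) with a simplified unsharp mask.

    amount    -- 1.5 corresponds to the 150% starting value suggested above
    radius    -- pixels on each side included in the blur
    threshold -- minimum difference before any sharpening is applied
    """
    sharpened = []
    for i, value in enumerate(row):
        lo = max(0, i - radius)
        hi = min(len(row), i + radius + 1)
        blurred = sum(row[lo:hi]) / (hi - lo)   # simple average blur
        difference = value - blurred
        if abs(difference) > threshold:
            value = value + amount * difference  # exaggerate the edge
        # Keep the result within the valid 0-255 range.
        sharpened.append(int(max(0, min(255, round(value)))))
    return sharpened

# A soft edge between a darker area (100) and a lighter area (160).
row = [100, 100, 100, 100, 160, 160, 160, 160]
sharp = unsharp_mask_row(row)
print(sharp)  # [100, 100, 82, 64, 196, 178, 160, 160]
```

The dark side of the edge becomes darker and the light side lighter, which is what makes the edge appear sharper, while the flat areas away from the edge are left alone. Raising the amount or the radius exaggerates the effect, and that exaggeration is the source of the unpleasant results mentioned above.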

8.       Which are the most useful web sites for help with scanning matters?

There are many sites but the following are well worth a look:-

Wayne Fulton's Scanning Tips    (www.scantips.com) A very good introduction

Tips for Scanning - Peter G Aitken    (www.pgacon.com/tips_on_scanning.htm)

Feedback on these notes will be welcomed. Please send it to the Author at ray.wilson@coaley.net.

Ray Wilson January 2005