Be Safe Online in 2014

Over at Slaw, Dan Pinnington has a series of posts (which originally appeared in LAWPRO Magazine) about protecting yourself online from the myriad scams and security risks that can afflict the unsuspecting or careless internet user. He tackles the dangers lurking in email, how to recognize and avoid surfing dangers, and how to avoid infections with anti-virus and anti-malware software. The posts are aimed at the legal profession, but anyone who needs a basic introduction to online security can benefit from them.

Just what can criminals do with your hacked email account or computer? Brian Krebs has a couple of eye-opening posts describing the value of a hacked email account (iTunes accounts sell for $8 each!) or a hacked PC. This post provides some excellent advice for defending your PC against attacks.

For additional reading, Lifehacker has some good articles on online security as well. And if you’re a Mozilla Firefox or Google Chrome user (you should be), here are some resources for securing your browser:

So start the year off right by reading up on cybercrime and taking some simple steps to make sure you don’t fall victim to it.

Photo source: Mario & Amanda, Flickr

Newspaper Databases: All That’s Not Fit for Research

They say newspapers are the first draft of history. They capture and disseminate noteworthy events as they unfold, and they are used by succeeding generations to make sense of a nation’s history and identity. Although this chronicling is increasingly occurring with dizzying speed on the web, newspapers, especially the paper editions captured in databases, will remain a fundamental resource for scholarly and other types of research for the foreseeable future.

Limitations of Full-Image Databases

There are, however, a number of problems with using newspaper databases for research. In his article “Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010,” Prof. Ian Milligan identifies one such issue: the shortcomings of optical character recognition (OCR) in databases of scanned microfilm. In a related post, he notes that keyword searches in databases that contain digitized, full-image versions of newspapers often result in incomplete retrieval of articles, due to the nature of the scanned material, the speed with which these databases were created, and the technological limitations. Consequently, research results can be problematic:

[H]yphenations are not covered (problematic in smaller columns, where Woodwork might be hyphenated as Wood-work across two lines), if microfilm streaks obscure a letter, if it was slightly tilted, or if the OCR just plain misses a character.* 

Prof. Milligan likens using these databases uncritically for historical research to “using a volume of the Canadian Historical Review with 10% or so of the pages ripped out.” While he recognizes that these databases are indispensable tools, he urges researchers to be aware of their limitations and to identify how they dealt with them.
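
To make the hyphenation problem concrete, here is a minimal Python sketch. It is an illustration only, not how any particular newspaper database works: the sample text and the function names (`naive_search`, `dehyphenate`, `tolerant_search`) are invented for this example. It shows a plain keyword search missing a word that was split across two lines, and a simple pre-processing step that rejoins the word before searching.

```python
import re

# Hypothetical OCR output for a narrow newspaper column: "Woodwork" is
# split across two lines with a hyphen, as in Prof. Milligan's example.
ocr_text = "The exhibition of fine Wood-\nwork opened on Monday."


def naive_search(text, keyword):
    """Plain case-insensitive substring match on the raw OCR text."""
    return keyword.lower() in text.lower()


def dehyphenate(text):
    """Rejoin words that were split by a hyphen at a line break."""
    return re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)


def tolerant_search(text, keyword):
    """Search after rejoining line-break hyphenations."""
    return naive_search(dehyphenate(text), keyword)


print(naive_search(ocr_text, "woodwork"))     # the split word is missed
print(tolerant_search(ocr_text, "woodwork"))  # the rejoined word is found
```

Even this small fix has a cost: it would also turn a genuinely hyphenated compound that happens to break at a line end (say, "well-" and "known") into "wellknown". And it does nothing for the streaks, tilted pages, and misread characters mentioned above, which is one reason retrieval from these databases remains incomplete.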

Not Just a Database Issue

As a former researcher at Canada’s “newspaper of record,” I have additional concerns about relying on newspapers and newspaper databases for research. Despite the best efforts of reporters, editors, researchers, and archivists, news articles have long been replete with inaccuracies and omissions. The reasons are numerous and have to do with both structural and human shortcomings: the fast pace of news production, the lack of access to sources and resources, the lack of space, human error, editorial bias, editorial decision-making regarding which corrections are worth appending, etc. Once news articles make it into databases, other problems arise: graphics are not rendered in text-based electronic databases, databases have search and display technical shortcomings, etc.

Add to these the continuing economic pressures facing news organizations, which have necessitated deep cuts at many newspapers, and using newspapers as research sources has become increasingly problematic. In the seemingly endless rounds of layoffs since the start of the Great Recession, copyeditors, researchers, and enhancers/archivists — the guardians of accuracy, clarity, and order — have been the worst hit, while reporters and editors are being expected to do more with less. Errors, omissions, bias, and inconsequential content are now baked into the newspaper product, and this will have deep consequences for future scholarship and research.

All this to say that cautionary notes like Prof. Milligan’s are welcome and necessary, and researchers should always, always cross-reference research results with multiple and varied sources.

*It is my understanding that an upgraded version of The Globe and Mail’s Canada’s Heritage from 1844 database is in the works and that it will address some of these shortcomings. For example, it will use higher-quality OCR and will search and identify articles as a whole, even across pages.

Photo source: Jon S, Flickr