What House of Cards Teaches Us About the Value of Good Research

Warning: some spoilers ahead!

I’ve been watching House of Cards, and like a lot of people, I’m morbidly fascinated by Frank and Claire Underwood’s evil machinations as they connive and kill their way to becoming the most powerful couple in the free (ha!) world. One thing that has struck me is how absolutely crucial good research is to their success.

The journalists in the show, of course, do a lot of digging as they try to uncover the truth behind Peter Russo’s death, but it’s the investigative research that Doug Stamper, Frank’s trusty assistant/dirty-work-doer, does that underlies major plot points and propels Frank’s career aspirations forward.

We see this in action as early as Chapter 2 in Season 1: Stamper uncovers an anti-Israel editorial that ran in the college newspaper that Michael Kern, the proposed candidate for Secretary of State, edited. When the story gets picked up by the media, the ensuing controversy for Kern seals the nomination for Frank’s preferred candidate.

Another point at which research helps Frank outmaneuver others is in Chapter 12, when Doug’s sleuthing reveals that, contrary to what President Walker had claimed, he and the billionaire Raymond Tusk are actually close friends. He discovers this by digging up the travel schedules of each man and identifying a number of instances where both men were in the same city at the same time (the president’s schedule would be publicly available, but I’m not sure how he could’ve gotten Tusk’s schedule; we’ll overlook that unexplained detail…). With this knowledge, Frank is able to remain in control in his interactions with Tusk, who, as he discovers later, is playing him.

Doug Stamper may be inscrutable and a bit creepy, but he’s a damn fine researcher. He has that magical combination of qualities that makes a good researcher: he’s smart, focussed, diligent, and detail-oriented, but also able to think outside the box and connect the dots. Frank’s success depends as much on Doug’s ability to dig up valuable gems that he can use to his advantage as it does on anything else. I’m only a couple of episodes into Season 2, but no doubt there will be other examples of Doug’s indispensable research.

Good research can make the difference between a successful leader and an also-ran. Good research provides knowledge, and knowledge is indeed power.

Photo source: House of Cards’ Facebook page

Be Safe Online in 2014

Over at Slaw, Dan Pinnington has a series of posts (which originally appeared in LAWPRO Magazine) about protecting yourself online from the myriad scams and security risks that can afflict the unsuspecting or careless internet user. He tackles the dangers lurking in email, how to recognize and avoid surfing dangers, and how to avoid infections with anti-virus and anti-malware software. The posts are aimed at the legal profession, but anyone who needs a basic introduction to online security can benefit from them.

Just what can criminals do with your hacked email account or computer? Brian Krebs has a couple of eye-opening posts describing the value of a hacked email account (iTunes accounts sell for $8 each!) or a hacked PC. This post provides some excellent advice for defending your PC against attacks.

For additional reading, Lifehacker has some good articles on online security as well. And if you’re a Mozilla Firefox or Google Chrome user (you should be), here are some resources for securing your browser:

So start the year off right by reading up on cybercrime and taking some simple steps to make sure you don’t fall victim to it.

Photo source: Mario & Amanda, Flickr

Newspaper Databases: All That’s Not Fit for Research

They say newspapers are the first draft of history. They capture and disseminate noteworthy events as they unfold, and they are used by succeeding generations to make sense of a nation’s history and identity. Although this chronicling is increasingly occurring with dizzying speed on the web, newspapers, especially the paper editions captured in databases, will remain a fundamental resource for scholarly and other types of research for the foreseeable future.

Limitations of Full-Image Databases

There are, however, a number of problems with using newspaper databases for research. In his article “Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010,” Prof. Ian Milligan identifies one such issue: the shortcomings of optical character recognition (OCR) in databases of scanned microfilm. In a related post, he notes that keyword searches in databases that contain digitized, full-image versions of newspapers often result in incomplete retrieval of articles, due to the nature of the scanned material, the speed with which these databases were created, and the technological limitations. Consequently, research results can be problematic:

[H]yphenations are not covered (problematic in smaller columns, where Woodwork might be hyphenated as Wood-work across two lines), if microfilm streaks obscure a letter, if it was slightly tilted, or if the OCR just plain misses a character.* 

Prof. Milligan likens using these databases uncritically for historical research to “using a volume of the Canadian Historical Review with 10% or so of the pages ripped out.” While he recognizes that these databases are indispensable tools, he urges researchers to be aware of their limitations and to identify how they dealt with them.
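The hyphenation problem described in the quote above is easy to reproduce. What follows is only a minimal sketch in Python, not a description of how any particular newspaper database works: the sample text and the function names `naive_search` and `dehyphenate` are invented for illustration. It shows a plain keyword search missing a word that was split across two lines, and a simple clean-up step that rejoins it.

```python
import re


def naive_search(ocr_text: str, keyword: str) -> bool:
    """Case-insensitive keyword match against raw OCR text (hypothetical helper)."""
    return keyword.lower() in ocr_text.lower()


def dehyphenate(ocr_text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line.

    Caveat: this also joins genuine hyphenated compounds that happen to
    break across lines, so it trades one kind of error for another.
    """
    return re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", ocr_text)


# Invented sample: a narrow column where "woodwork" breaks across two lines.
sample = "The union of wood-\nwork shops met on Friday."

found_raw = naive_search(sample, "woodwork")                  # misses the split word
found_clean = naive_search(dehyphenate(sample), "woodwork")   # finds it after rejoining
```

A clean-up pass like this helps only with line-end hyphens. Streaked microfilm, tilted scans, and misread characters would still cause missed articles, which is why the advice to treat search results as incomplete still stands.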

Not Just a Database Issue

As a former researcher at Canada’s “newspaper of record,” I have additional concerns about relying on newspapers and newspaper databases for research. Despite the best efforts of reporters, editors, researchers, and archivists, news articles have long been replete with inaccuracies and omissions. The reasons are numerous and have to do with both structural and human shortcomings: the fast pace of news production, the lack of access to sources and resources, the lack of space, human error, editorial bias, editorial decision-making regarding which corrections are worth appending, etc. Once news articles make it into databases, other problems arise: graphics are not rendered in text-based electronic databases, the databases have technical shortcomings in search and display, etc.

Add to these the continuing economic pressures facing news organizations, which have necessitated deep cuts at many newspapers, and using newspapers as research sources has become increasingly problematic. In the seemingly endless rounds of layoffs since the start of the Great Recession, copyeditors, researchers, and enhancers/archivists — the guardians of accuracy, clarity, and order — have been the worst hit, while reporters and editors are being expected to do more with less. Errors, omissions, bias, and inconsequential content are now baked into the newspaper product, and this will have deep consequences for future scholarship and research.

All this to say that cautionary notes like Prof. Milligan’s are welcome and necessary, and researchers should always, always cross-reference research results with multiple and varied sources.

*It is my understanding that an upgraded database for The Globe and Mail’s Canada’s Heritage from 1844 is in the works that will address some of these shortcomings. For example, it will use higher quality OCR and will search and identify articles as a whole, even across pages.

Photo source: Jon S, Flickr

Digging for Data

When searching for data and statistics, usually the best approach is to first consider who would be interested in the information. If you can identify the organization or group that has a need to know the information in order to operate, or is mandated to collect and disseminate the data, you are halfway to finding the data, or at least to assessing whether the data exists.

But what if the information is obscure or the source nebulous? Until recently, conducting this kind of research on the web was difficult, if not impossible. Advanced Google syntaxes are useful, as is adding the word “database” to your search terms, but these methods go only so far since Google doesn’t index the deep web, where such information usually exists. A few search engines, however, have recently made this kind of research much easier.
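For readers who like to script their searches, here is a rough sketch of what combining advanced syntaxes with the word “database” can look like. It assumes Google’s `site:` and `filetype:` operators and quoted exact phrases; the helper names `build_query` and `search_url`, the search terms, and the `example.org` domain are all made up for illustration.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, filetype=None, exact_phrase=None):
    """Assemble a search string from keywords and optional operators (hypothetical helper)."""
    parts = list(terms)
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')      # quotes request an exact phrase
    if site:
        parts.append(f"site:{site}")           # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")   # restrict results to one file type
    return " ".join(parts)


def search_url(query):
    """Turn the query string into a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Illustrative search: spreadsheets about wheat exports on a placeholder domain.
query = build_query(["wheat", "exports", "database"], site="example.org", filetype="xls")
url = search_url(query)
```

Operators like these only narrow down what Google has already indexed, so they do nothing for material sitting in the deep web. That is where the dedicated data search engines come in.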

Zanran is one such search engine. The clever idea behind it is that images often contain numerical data. The search engine finds these images and indexes the surrounding text. It currently extracts tables and images from HTML, PDF, and Excel files and promises to add PowerPoint and Word documents in the near future. It’s a good resource for finding obscure statistics, or at least identifying possible sources by finding related information.

Quandl, another search engine for data, is impressive in its scope, its transparency, and the ability it gives users to download datasets in a number of formats. It has so far indexed 8 million time-series datasets from 400 quality sources. Scroll down to the bottom of the page of results to see information about the frequency of the data, the date the search engine retrieved the data, a link to the original source, and other relevant information.

DataMarket is a portal to free and proprietary datasets. It is aimed at the enterprise market, but it is free to search and create charts and visualizations of the public data. Find the list of data providers here.

Finally, the University of Auckland Library’s OFFSTATS is worth bookmarking. It is not a search engine, but a directory of official statistical sources on the web, organized by country, region, subject, or a combination of categories. It is a handy resource to consult for locating official sources.

These resources certainly make researching data and statistics easier and more fun. Know of other good statistics search engines or meta-sources? Please share them in the comments!

Photo source: Iman Mosaad, Flickr

The Ethics of Social Media Cyber-Sleuthing

Without a doubt, social media and social networking sites like Facebook, Twitter, LinkedIn, and countless others have become indispensable tools in conducting background investigations, due diligence, employment pre-screening, and other types of investigations. Pursuit Magazine recently ran a good two-part series that not only offered pointers to some lesser-known social media sites but also discussed the importance of adequately capturing and presenting the information found on these sites.

The articles also highlighted some ethical and legal issues around gathering such information, advising, for example, against using shady techniques like pretexting and password cracking to gain access to protected material. Additionally, in Canada, a number of laws – notably human rights and privacy laws – govern the types of information that may be gathered on social media and elsewhere, the methods used for gathering the information, and the decisions made based on the information.

To stay on the side of the law, it is crucial for organizations and investigators to exercise caution when researching, collecting, and disclosing personal information about individuals. The Information and Privacy Commissioner of British Columbia has released some guidelines for social media background checks (PDF), identifying some pitfalls and issues to keep in mind:

  • Accuracy of information (Is it the right profile? Was the profile created by the individual himself or herself? Is the information current?)
  • Collecting irrelevant or too much information
  • Over-reliance on consent

Exercising good judgment when trawling social media sites isn’t just a matter of law and ethics; it can also save the organization from embarrassment, a lesson that the Toronto Star learned the hard way when it published false allegations against an Ontario MPP based on an old Facebook photo. The newspaper issued a rare front-page apology, citing an “egregious lapse” of standards.

Photo source: Jason Howie, Flickr

Roundup of Subject Guides and Directories

Gwen Harris’s post about the WWW Virtual Library — a directory of recommended web resources in various subject areas started back in the day by Tim Berners-Lee — inspired me to do a quick roundup of a few of the most useful and well-kept subject guides and directories I’m aware of.

In the early days of the world wide web, when the number of websites was small, directories were common and extremely useful in locating websites in an organized way. Perhaps the best-known one was Yahoo!, which was a hierarchical directory before it was a search engine (it still maintains a directory). Today, with billions of websites online, directories and subject guides are arguably even more important to help direct us to vetted, high-quality sources of information and save us from flailing around on search engines. As Gwen notes, however, subject guides/directories are a dying breed because of the amount of work involved in their upkeep.

Some of the guides that are updated regularly include:

  • The Virtual Private Library: A massive (almost overwhelming!) list of resources, branded as Subject Tracers, on a number of research topics. If you’re looking for comprehensiveness rather than curation, these lists are chock-full of links in various subject areas.
  • Toddington’s Free Online Open Source Intelligence (OSINT) Resources: A companion to their paid knowledge base, this page lists links to useful resources in a number of categories, to help online research and investigative professionals.
  • University library sites: University library sites provide wonderful guides and pathfinders to reliable research resources. While the material tends to be academic and scholarly (obviously) and is often limited to the library system’s holdings, these guides can provide research direction for an unfamiliar subject area, and with a little resourcefulness, one can often access the material in other collections. The University of Toronto Libraries Research Guides and the Harvard Library Research Guides are two good ones, or search for “LibGuides” and your subject area of interest to find others.

What are some of your favourite subject guides and directories?

The Periodic Table of Business Research Databases

I just came across a terrifically handy tool from Alacra called the Periodic Table of Business Research Databases for identifying the right database for business-related research. There are a vast number of databases available on the market, each with its own focus and content depth. This tool provides a nice, quick and dirty overview of most of these information sources, identifying them by the various categories of research (company profiles, credit and investment research, market research, news, etc.).

Unfortunately, it is missing some key Canadian databases, such as Infomart and Newscan, but I’ll be bookmarking it.

Free Speech and Privacy at Work

When can an employee’s off-duty web postings or other activities be reasonably monitored and controlled by an employer in order to protect the business? This has been a recurring question since the rise of the public internet and especially of social media. This article does a nice job of reviewing the case law and dissecting the issues of privacy, free speech, and employer loyalty with regard to online postings, and notes that:

It is fair to say that although technology has changed the playing field, the principles with respect to off-duty conduct in Canada have not changed. As long as employees must remain subordinate and loyal to their employer, there are limits to what they can express, even on their own devices and even if they are off-duty.

While an employer can never completely control the online behaviour of its employees, it can manage the delicate balance between protecting the business and respecting the rights of employees to privacy and free speech by putting in place “a well-drafted and well-communicated policy which clearly identifies acceptable workplace practices and use of company equipment as well as personal equipment, both at work and off work.”

Are You a Skilled Googler?

Most of us think we’re great Googlers. And it’s a testament to Google’s strength as a mostly reliable search engine that we do usually find what we’re looking for with a few simple keywords. But beyond the quick factual search, things can get tricky, and as a number of studies have shown, most of us miss good information on the open web due to our limited search skills (and here it’s worth noting that less than 10% of online information is actually available on the open web via search engines; the other 90% resides on the deep or invisible web).

There are a number of ways to improve your search skills. While Google appears simple and intuitive on the surface, its power can best be harnessed with some training, and Google provides a number of online training guides to help improve the search skills of its users. Two self-paced courses have been developed for power searching and advanced power searching, and this course, geared to students and their teachers, provides lesson plans and trivia challenges. Also available are webinars that guide the user through a variety of tools and techniques to find higher quality sources more easily.

But no matter how advanced a Googler you become, you’ll be missing a lot of good information if you rely solely on Google. Other search engines such as Bing and DuckDuckGo index the web differently and have different ways of prioritizing results. (See this slide deck from Karen Blakeman of RBA Information Services for some alternatives to Google.) And as mentioned before, only a small fraction of online information is indexed through search engines; countless specialized databases and indexes provide high-quality material that won’t appear in search engine results.

By the way, Google has come up with a fun way to put your Google search skills to the test. A Google a Day is a daily puzzle that can be solved by using clever search skills on Google.