If you work in business, research or technology, odds are you’ve heard the term ‘big data’. Definitions range from ‘more data than you can understand’ to ‘a change in how we process and interpret information, with more of it than we know what to do with’. The exact science is still debated, but the impact is not: while big data is not the be-all and end-all solution to every business problem, it does represent a shift in where we’re going and what we’re capable of in learning, research and technology. Even if your business is not in a position to use big data, it’s best to understand and get comfortable with it now, because big data is not going away. With the ‘internet of things’ becoming more and more commonplace, generating seemingly endless data about our behaviours, patterns and surrounding environment, big data can open up new possibilities in our world, if only we can figure out how best to use it.

For me, the impact is a bit more immediate: in building a business to be the information and data go-to here in Halifax, I’ve already been asked what big data means and where my services fit when companies want to get on board. It’s not something I have a lot of hands-on experience with, but it is something I try to keep up on. So when I heard that an international big data conference was planned, in my own backyard no less, I jumped at the chance to get involved.
If you were at the 2015 Big Data for Productivity Congress, odds are we crossed paths: I was the gal by the registration table with the shoulder-cropped hair and Twitter earrings. Bless the vendor in Dubai who sold me those earrings; I wear them to every technology conference. As a volunteer, most of my time at the Congress was spent keeping things running smoothly for guests: checking in registrations, answering questions, escorting guests to meetings and relaying errands between event management and attendees. I also had the opportunity to sit in on some huge presentations, join more than a few thought-provoking conversations, and meet individuals from across the globe I might never otherwise have had the pleasure of speaking with. Here are just a few of my observations from the conference’s three days:
Mind = Blown
This happened a few times, but was particularly highlighted during Plenary Speaker Ray Kurzweil’s address on Tuesday. A central theme of his talk was how fast technology is evolving and will continue to evolve, growing at an exponential rate (1, 2, 4, 8…) rather than the linear pace humanity is used to (1, 2, 3, 4…). Technologies still treated as novelties, such as 3D printing, will continue to become more commonplace and accessible, and what we can do with them will be game-changing. We sit at a time when our very biology is being examined as highly advanced code, and as we quickly unlock that code, the ability to rewire our systems to, say, reject cancer, fight Ebola, or scan our bodies and replace vital organs with customized, biologically compatible replacements, will be here.
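The gap between those two curves is easy to underestimate. As a quick illustration (the doubling factor and step counts here are my own, not figures from Kurzweil’s talk), a few lines of Python show how fast they diverge:

```python
# Linear growth adds a fixed amount each step; exponential growth
# multiplies by a fixed factor each step. After a few dozen steps
# the difference is staggering.

def linear(steps, increment=1):
    """First `steps` values of a linear sequence: 1, 2, 3, 4, ..."""
    return [increment * n for n in range(1, steps + 1)]

def exponential(steps, factor=2):
    """First `steps` values of a doubling sequence: 1, 2, 4, 8, ..."""
    return [factor ** n for n in range(steps)]

print(linear(8))        # [1, 2, 3, 4, 5, 6, 7, 8]
print(exponential(8))   # [1, 2, 4, 8, 16, 32, 64, 128]

# Thirty steps of linear growth reaches 30; thirty doublings pass
# half a billion.
print(linear(30)[-1])       # 30
print(exponential(30)[-1])  # 536870912
```

Nothing fancy, but it makes the point: our linear intuition runs out of room almost immediately.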
The future of classification will be very different
This one is more for my fellow information science professionals: if you are presently in records management or a related field and intend to remain there for your career, now is the time to keep an eye on the shift in what we do and why we do it, because when it comes to traditional cataloguing and tagging practices, the computer is already poised to take your job. OCR, which is needed to keyword-search scanned documentation, is also increasing in scope and improving in reliability: still not perfect, but getting there. On the flip side, classification remains a vital part of our records and data practices, particularly for privacy and security. To put it bluntly, if you don’t know what information you have or who has access to it, how will you know what was lost in the event of a security breach? How can you safely say that only those with a need to know have access to the information your business holds? Classification is also critical for discovering information and data whose connection is based on abstract concepts: projects or programs where the core components do not necessarily share a common timeline, location or document attributes.
As smart as your database is, make sure someone still checks that data.
Perhaps someday machines won't need us, but right now that’s not the case. Even with the cleanest dataset, you should still give your information a once-over with human eyes. This doesn’t mean every line needs to be checked for quality (impossible given the increasing volume of data being processed), but random spot-testing is still a must. Errors happen, issues remain in the translation between 'what you asked for' and 'what the computer thinks you asked for', and there’s little worse than committing significant dollars or customer trust to a project, only to discover a fundamental flaw in the information itself. Like the feature article reviewed by the office editor before final print, or the software test run before and after upgrades, going over the information for quality control before using it is a must. I’ve told clients this before, but it was refreshing to be vindicated by an expert on Google Search.
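For the curious, that kind of random spot-check doesn’t need to be elaborate. Here’s a minimal sketch in Python; the dataset and the validity rule are entirely hypothetical, just stand-ins for whatever checks make sense for your own data:

```python
import random

def sample_for_review(records, sample_size=10, seed=None):
    """Pick a random subset of records for manual quality review.

    Spot-checking a sample won't catch every error, but it surfaces
    systematic problems before the data drives real decisions.
    """
    rng = random.Random(seed)
    size = min(sample_size, len(records))
    return rng.sample(records, size)

# Hypothetical example: a tiny dataset with one obviously bad value.
records = [{"id": i, "age": a} for i, a in enumerate([34, 29, -5, 61, 402, 18])]

for row in sample_for_review(records, sample_size=4, seed=42):
    # A human-defined sanity rule; anything failing it goes to review.
    if not (0 <= row["age"] <= 120):
        print("needs review:", row)
```

The point isn’t the code; it’s the habit. A few minutes of sampling before launch is far cheaper than discovering the flaw after the money is spent.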
The diversity of interests and applications
Health. City planning. The internet of things. Academia. Government. Big business. Aerospace. Even if your company doesn’t process big data as part of its operations, based on the works in progress and the plans for the future, big data will make an impact. How can it not? Big data is becoming part of nearly every industry at some level, and as the understanding and technology grow, so too do the applications. Wherever you are, whatever you do, big data will affect you: if you have ever used Google Search or flown with an airline, it already has.
A better understanding of visual data
I'm already a proponent of visual data, mostly because we as humans process visual information instinctively; writing and reading are learned, but sight, smell, touch, hearing and taste are hard-wired into our biology for collecting and disseminating details. I love infographics, keep and share ones I enjoy all the time, and turn to them as teaching aids for different ways to communicate and make information ‘stick’. Visual data is aesthetically pleasing and often makes a larger impact on an intended audience. With the right layout and use it can even attach stronger impressions, such as empathy or anger, to neutral statements or displays. What I was unaware of, however, is that visual data not only allows better presentation of information; it can also be a boon to the research itself: it can highlight where to look further, flag quality issues that must be addressed, and surface connections that might otherwise be missed. Of course, we need to be careful not to see correlation where there is none, but that’s an issue we’ve always needed to stay aware of.
All in all, the Congress was excellent: a sea of driven, passionate individuals, organizations, government departments and companies, all sharing knowledge to work out how they can harness ever-growing data resources and use them to benefit current projects and solve problems. For my part, I admit I’m not a data scientist: while I have an interest in analytics, my drive veers less towards what information can tell us and more towards how we as humans can best access, acquire, share and use the information we are increasingly bombarded with. Nevertheless, I’d be lying if I said I wasn’t excited by the research possibilities of those presently in the field. I earnestly look forward to recognizing big data in action more and more, and to returning to memories of ground-breaking talks like those presented, to see how far things have come.