Volume 18, Number 3/4
Social Awareness Tools For Science Research
Tamara M. McMahon
University of Kansas Medical Center
James E. Powell, Matthew Hopkins, Daniel A. Alcazar, Laniece E. Miller, and Linn Collins
Los Alamos National Laboratory
Ketan K. Mane
Renaissance Computing Institute (RENCI)
Point of contact for this article: Tamara M. McMahon, firstname.lastname@example.org
Tools for social networking and social awareness are developing rapidly and evolving continuously. They are gaining popularity in a growing number of professional as well as personal activities, including scholarly research. There are social awareness tools for science researchers that facilitate collaboration, help manage references, and offer options for presenting findings in new ways. This paper discusses those tools. Evaluating and understanding their functionalities requires effort, and scientists can be reluctant to invest the necessary time to learn to use and populate them on their own. This suggests that an important role for librarians is to evaluate the many social awareness tools available, to recommend the ones that are best suited to each researcher's needs, and to help researchers use those tools effectively.
As social networking tools compete to touch all aspects of our lives, it is likely that science and scientific research will be increasingly affected. After all, social networks have always been a catalyst for scientific research, from large formal conferences to weekly department get-togethers. Research must begin with an idea, and collaboration must begin with a connection. Social networks facilitate scientific progress by mixing people and ideas; social networking tools bring this mixing online. Now mini-conferences can be held in Google+, while spontaneous discussions erupt on Facebook, serendipitous meetings are enabled by Foursquare, and the insights from all these interactions are disseminated via Twitter. The rapid and sustained adoption of these tools is making such scenarios more common. In just over two years of existence, Foursquare has surpassed 10 million users. (Indvik, 2011) Meanwhile, Facebook has 800 million active users, half of whom access the site on any given day. (Facebook, 2011) As these tools connect more people, they create more paths for ideas to travel, and subsequently more opportunities to seed new ideas. Once the crucial idea is sparked and the necessary collaborators are linked, the research begins, and a different set of tools becomes necessary.
This paper discusses social awareness tools developed specifically for the scientific researcher: tools to facilitate collaboration, tools to manage article references, and tools to present findings in new ways. They are tools that the authors have used at the Los Alamos National Laboratory (LANL) Research Library. The Library's Knowledge Systems team created ScienceSifter, SAT, and EXPAT, while the Library's Customer Service team has discussed VIVO, Mendeley, and SciVee with interested researchers. Our scientists are not opposed to learning a single tool, but they have neither the time nor the inclination to test all the different options themselves. We suspect this is not unique to our institution, but common across academia. One of the authors discovered similar concerns when the Penn State Hershey College of Medicine implemented the Harvard Catalyst Profiles tool. The implementation revealed how little scientists know about such tools and how little time they have to learn them. This suggests a role for librarians: to study the many social awareness tools available, to recommend the best one for each researcher's particular needs, and to help that researcher climb the tool's learning curve.
Social Awareness Tools versus Social Networking Tools
Social tools can be broken down into two main types: social networking and social awareness. In this paper, we define social networking tools as those that build upon people, and social awareness tools as those that build upon data. Social networking tools allow a user to connect with others and utilize these connections to create networks. Social awareness tools, on the other hand, allow one to see or manipulate data about people, such as co-authorship networks. They allow the researcher to become aware of new social connections through the ability to view and combine data in different ways. These tools, however, are not mutually exclusive. A social tool that combines both social networking and social awareness elements provides a powerful framework for advancing research.
Social awareness tools and social networking tools are similar in many respects. A human element exists in social awareness tools, but data has an equally important role. Social networking tools, on the other hand, rely much more extensively on the human element, such as "friending" someone on Facebook, sending a tweet on Twitter, or adding someone to a Google+ circle. These tools require a human action. As a result, an element of trust exists for the user within social networking tools, and recommendations and collaborations are built upon this trust. Social awareness tools build recommendations based on data. Some go so far as to predict successful collaborations for researchers who have never met, physically or virtually, in varying areas of research based solely on data (Weber, 2011).
Unlike users of social networking tools, people may not even realize they are using social awareness tools, because they lack a conceptual understanding of or familiarity with such tools, especially when a social awareness tool is used within the framework of a social networking tool. The Friend Finder functionality and the targeted ads within Facebook are examples of "embedded" social awareness tools. These recommendations are based on data analysis. While many find the targeted advertisements invasive or annoying in a consumer setting, the use of such data mining techniques as components of a research tool could connect researchers and ideas that might otherwise never have the opportunity to interact.
The Scientific Application of Social Awareness Tools
The ubiquity of Facebook, Google+, Twitter and Foursquare all but guarantees that scientists number among their frequent users. But do these social networking tools have anything to offer the practice of science? Can libraries use these tools not just to connect with researchers, but also to enhance the research process?
It is instructive to look to an earlier innovation in scientific communication: the journal article. Compared to the time frame involved in the publication of a book, the journal article allowed scientists to disseminate their ideas quickly to a wide audience in a more concise format. A similar expediency also describes some of today's most popular social networking tools, like Twitter. Furthermore, the journal article shares another important similarity with social networking tools. Unlike email, chat or text, communication across the social network plays out in the public sphere, where it can be aggregated and analyzed.
A scientist's journal article is intrinsically valuable to science in a way that his tweets or status updates are not. But like a component of a social network, the article has additional worth simply as one of many nodes in a network of literature. Networks form around all the ways articles connect to one another: citations, subjects, co-authorships, affiliations, etc. Each of these networks can be mined. Citation analysis is perhaps the most familiar example. High citation counts hint at influential papers and researchers, while certain patterns of citations might reveal related sets of articles or topics that bridge disparate fields.
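As a rough illustration of the simplest form of citation analysis mentioned above, the following Python sketch counts how often each paper is cited. The paper names and citation pairs are hypothetical sample data, not drawn from any real collection.

```python
from collections import Counter

# Hypothetical citation edges: (citing paper, cited paper).
citations = [
    ("Paper B", "Paper A"),
    ("Paper C", "Paper A"),
    ("Paper C", "Paper B"),
    ("Paper D", "Paper A"),
]

# Count how often each paper is cited (its in-degree in the citation network).
citation_counts = Counter(cited for _citing, cited in citations)

# List papers from most cited to least cited.
for paper, count in citation_counts.most_common():
    print(paper, count)
```

A raw count like this is only a hint of influence; finding related sets of articles or bridging topics would require examining the patterns of citations rather than the totals.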
Social networking chatter itself may contain informative patterns. Just as one can read a scientist's peer-reviewed, published articles, one may also follow that person's blog posts, online comments, shared videos, and so on. The latter communications may contain few novel scientific insights, but in aggregate, combined with those of his colleagues, and viewed across a network, these abbreviated communications may reveal aspects of science itself: the connections between researchers and disciplines, or the growth and emergence of ideas. The more social networking tools proliferate and flourish, the greater the potential for social awareness tools to harness data and enhance our understanding of the research process.
Examples of Tools for Researchers
VIVO and Harvard Catalyst Profiles
VIVO and Harvard Catalyst Profiles, commonly referred to as Profiles, are open source research networking tools often described as "Facebook for Researchers." (Vence, 2009) Both are funded by the National Institutes of Health (NIH) to support national networking for biomedical researchers. These systems allow researchers to create a profile describing their research areas, titles, education, grant awards, and publications. Authoritative information is harvested from external databases, such as PubMed, and a variety of internal databases. In Profiles, keywords are automatically generated from the MeSH (US National Library of Medicine, 2011) terms associated with harvested publications. Profiles software provides Active and Passive networks. The Active Networks are created by the person maintaining the researcher's profile page, in a way similar to Facebook: that person selects the researcher's collaborators. The Passive Networks are automatically generated based on data about the researcher. For example, MeSH terms associated with a researcher's publications can be used to link other researchers with similar interests (Weber, 2011). Profiles can predict potential collaborators based not only on matching keywords, but also on centrality metrics (Newman, 2001) calculated automatically through the social network analysis functionality. Both VIVO and Profiles provide powerful co-author visualizations, and allow users to glean information about funding and topics necessary for making future research decisions.
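The idea behind a Passive Network can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Profiles implementation: the researcher names, the MeSH term sets, the function name `passive_network`, and the `min_shared` threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical sample data: researcher -> MeSH terms from harvested publications.
profiles = {
    "Researcher A": {"Neoplasms", "Genomics", "Biomarkers"},
    "Researcher B": {"Genomics", "Biomarkers", "Proteomics"},
    "Researcher C": {"Social Networking", "Libraries"},
}

def passive_network(profiles, min_shared=2):
    """Link every pair of researchers who share at least min_shared terms."""
    links = []
    for a, b in combinations(sorted(profiles), 2):
        shared = profiles[a] & profiles[b]  # terms both researchers have
        if len(shared) >= min_shared:
            links.append((a, b, sorted(shared)))
    return links

links = passive_network(profiles)
print(links)
```

With this sample data only Researcher A and Researcher B are linked, through their two shared terms. No human action is required to create the link, which is the difference between a Passive and an Active Network.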
The Research Networking Group of the Clinical and Translational Science Awards (CTSA) consortium initiated the Direct2Experts (Distributed Interoperable Research Experts Collaboration Tool) pilot project, which includes Profiles and VIVO. The aim of this project was to design a national federated network of researchers across multiple institutions, incorporating data from the tools and platforms local to each institution. The initial 28 institutions successfully completed the project by creating a tool that gives the institutions as much control as possible over their data while providing a federated search. While this product should be considered a proof-of-concept, it is an example of just how far and fast a research network can grow.
Mendeley
Mendeley is an online service, along the lines of Facebook or Flickr, designed to help researchers manage and share their PDF files. Public collections share reading lists and associated metadata with the world at large. Smaller shared collections can include the full-text PDF articles. (Barsky, 2010) The creators attempted to mimic the music service Last.fm, which allows users to catalog their music while anonymously aggregating data about listening preferences. Similarly, Mendeley was designed to help manage academic papers and to anonymously track reading habits in order to show trends such as popular papers and key researchers within the various communities. By aggregating metadata, tags, and usage, Mendeley hopes to become an alternative to pay-walled databases. (Henning, 2008)
ScienceSifter
ScienceSifter (Collins et al., 2005) was designed as a personalized RSS feed aggregator that focuses on the challenge of sifting through current scientific literature, in order to facilitate shared awareness of intellectual activity among group members. Designed using open source tools, ScienceSifter gave a group of researchers or channel editors (trained librarians) the opportunity to set up a collaborative space for sharing current scientific literature of interest. Setting up a collaborative space was a two-step process, in which the researchers: a) entered keywords of interest, and b) selected different sources from which to aggregate current scientific literature. Once the collaborative space was set up, ScienceSifter's underlying architecture and algorithms used the keywords of interest to automatically filter and aggregate information from multiple RSS feeds. At any given time, results were displayed in one of three formats: a list, a list with descriptions, or a hyperbolic tree visualization. Users in the group had the option to save items as part of their shared list. Thus ScienceSifter allowed researchers to save the time they would normally spend finding current literature in their area of interest, as well as the time needed to set up user access privileges.
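The keyword-based filtering and aggregation step can be sketched as follows. This is a minimal illustration, not ScienceSifter's actual algorithm, which is not described here: it assumes feed items have already been parsed into dictionaries, uses a simple case-insensitive substring match, and the feed names, items, URLs, and the function name `sift` are hypothetical.

```python
def sift(feeds, keywords):
    """Return items from all feeds whose title or description mentions a keyword."""
    wanted = [k.lower() for k in keywords]
    seen_links = set()
    matches = []
    for source, items in feeds.items():
        for item in items:
            if item["link"] in seen_links:
                continue  # the same article may appear in several feeds
            text = f"{item['title']} {item.get('description', '')}".lower()
            if any(k in text for k in wanted):
                seen_links.add(item["link"])
                matches.append({"source": source, **item})
    return matches

# Hypothetical, already-parsed items from two sources chosen by the group.
feeds = {
    "Feed 1": [
        {"title": "Graphene transistors at room temperature", "link": "http://example.org/1"},
        {"title": "Advances in protein folding", "link": "http://example.org/2"},
    ],
    "Feed 2": [
        {"title": "Graphene transistors at room temperature", "link": "http://example.org/1"},
        {"title": "New results", "description": "A survey of GRAPHENE sensors",
         "link": "http://example.org/3"},
    ],
}

shared_list = sift(feeds, ["graphene"])
for item in shared_list:
    print(item["source"], "-", item["title"])
```

The two steps of the setup process map onto the two arguments: the keywords entered by the researchers and the sources they selected.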
SAT & EXPAT
When you combine populations of users with social tools such as Facebook and Google+, social networks inevitably form. These networks are not just a function of technology but also of human nature. Similarly, in scientific research, a publication carries several inherent mechanisms that make explicit the network of collaborations that produced it. The co-authorship listing, for example, delineates the network of collaborators for a paper. Broaden this network to all papers co-authored by each co-author, and soon you may end up with a fairly extensive and interesting network that is both temporally and topically diverse. It is through such networks that users may broaden their knowledge of a given topic, or serendipitously discover new material.
Through semantic transformation of bibliographic data, the Knowledge Systems Team at Los Alamos National Laboratory formally represents these co-authorship networks using the Friend of a Friend (FOAF) ontology. FOAF describes people, and the relationships between them. The fact that two or more authors co-authored a paper is represented as a foaf:knows relationship between those authors. In this way, as data is mapped from a collection of bibliographic records, a social graph for these records is built. The tool built by the Knowledge Systems Team for exploring graphs like this is called the Social Awareness Tool, or SAT (Powell, et al., 2010). SAT renders a visualization of a co-authorship graph where authors are nodes and the edges connecting authors represent the knows relationship. As one might imagine, with as few as a hundred records, the graph can grow to be quite large, so two techniques are employed to better enable users to explore the graphs. First, a user can search for text that occurs in a particular document's title or abstract. The resulting social network is a subset (a subgraph) of the entire co-authorship graph, where only authors and co-authors associated with papers that matched the search are returned. Secondly, network centrality measures are utilized to highlight certain special nodes. These centrality measures are calculations that can be used to determine things like the node with the most connections (degree) or the node that connects clusters of other nodes (betweenness). The SAT will generate views where the node with the highest degree centrality, or the node with the highest betweenness centrality, is highlighted.
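The following Python sketch illustrates the ideas in this paragraph under stated assumptions; it is not the SAT code. The records and single-letter author names are hypothetical, the graph is held in a plain dictionary rather than as FOAF triples, the search looks only at titles, and betweenness is computed with Brandes' algorithm, a standard method that the source does not name.

```python
from collections import deque
from itertools import combinations

# Hypothetical bibliographic records.
records = [
    {"title": "Citation mining for physics", "authors": ["A", "B", "C"]},
    {"title": "Semantic web for libraries", "authors": ["C", "D"]},
    {"title": "Semantic search interfaces", "authors": ["D", "E"]},
    {"title": "Visualizing hyperbolic trees", "authors": ["E", "F", "G"]},
]

def coauthor_graph(records, query=None):
    """Map each author to the set of co-authors they 'know' (foaf:knows)."""
    graph = {}
    for rec in records:
        if query and query.lower() not in rec["title"].lower():
            continue  # keep only records that match the search text
        for author in rec["authors"]:
            graph.setdefault(author, set())
        for a, b in combinations(rec["authors"], 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def betweenness(graph):
    """Unnormalized betweenness centrality for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack, preds = [], {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}   # -1 marks an unvisited node
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: c / 2 for v, c in score.items()}  # each pair was counted twice

graph = coauthor_graph(records)
degree = {author: len(coauthors) for author, coauthors in graph.items()}
between = betweenness(graph)
subgraph = coauthor_graph(records, query="semantic")
```

In this sample, authors C and E tie for the highest degree (three co-authors each), while author D, with only two co-authors, has the highest betweenness because D connects the two clusters. The search for "semantic" returns a subgraph containing only C, D, and E.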
As noted above, there are some additional relationships specified in the semantic representation generated for bibliographic data. Subject-author relationships form the basis of a second tool called the EXpertise Awareness Tool (EXPAT) (Powell et al., 2010). EXPAT graphs show subject headings associated with authors. In the SAT, edges between nodes represent knows relationships, but an EXPAT graph is what is known as a bipartite graph, where two types of nodes occur in one graph. Author names are one of the node types, and subject headings are the other. Edges represent a relationship between authors and subjects. Although these graphs become complex very quickly, several query options are offered to limit the number of nodes displayed. The interdisciplinary nature of the work of some researchers is quickly and dramatically apparent in such graphs, as are the numerous connections among researchers in a particular field.
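A bipartite author-subject graph of this kind can be sketched in Python as two lookup tables, one per node type. This is an illustrative sketch rather than the EXPAT implementation; the records, author names, subject headings, and the function name `build_bipartite` are hypothetical.

```python
from collections import defaultdict

# Hypothetical bibliographic records with subject headings.
records = [
    {"authors": ["A", "B"], "subjects": ["Physics", "Computer Science"]},
    {"authors": ["B", "C"], "subjects": ["Biology"]},
    {"authors": ["C"], "subjects": ["Biology", "Chemistry"]},
]

def build_bipartite(records):
    """Connect every author of a record to every subject heading of that record."""
    author_to_subjects = defaultdict(set)
    subject_to_authors = defaultdict(set)
    for rec in records:
        for author in rec["authors"]:
            for subject in rec["subjects"]:
                # An edge always joins one author node and one subject node.
                author_to_subjects[author].add(subject)
                subject_to_authors[subject].add(author)
    return author_to_subjects, subject_to_authors

author_to_subjects, subject_to_authors = build_bipartite(records)

# Authors linked to the most subjects are candidates for interdisciplinary work.
breadth = {a: len(s) for a, s in author_to_subjects.items()}
print(breadth)

# A query on one subject limits the display to the authors connected to it.
print(sorted(subject_to_authors["Biology"]))
```

Here author B is connected to three subject headings, which is the kind of interdisciplinary pattern that stands out visually in such a graph, and the query on "Biology" returns authors B and C.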
These tools use semantic representations of bibliographic data combined with basic search for extracting specific subgraphs that are then presented visually. A similar, but hypothetical analogy would be, for example, the ability to view a stripped-down map of the "friend" relationships in Facebook for a set of users, based solely on interests or content in status posts. This may not be an optimal method for locating friends on a social networking site but it works quite well for research purposes. Furthermore, it is difficult to imagine how one might enable users to explore these relationships in a conventional text interface.
SciVee
SciVee is a website where researchers, students and educators can upload and share their published scientific articles (including posters and slides) and integrate them into a video called a "PubCast", which allows authors to discuss and highlight the important points of their published articles (displayed next to the video) while relevant text or figures synchronously appear. (Fink & Bourne, 2007) The PubCast (essentially a multimedia presentation) is a dynamic form of communication specifically designed to engage its viewer. It conveys more information than an abstract, takes just a few minutes to view, and requires much less time than reading a full scientific article, which can take several hours. (Timer, 2007) More importantly, because the PubCast's design has more visual appeal than an article alone, it generates greater interest in the article in the form of increased views and downloads. According to a 2008 study conducted by SciVee, PubCasts were shown to increase article views and downloads by 75 percent over time. Interactive options for the user include viewing the video alone, viewing the video and article together, or listening to the audio alone. Finally, the viewer also has access to the author's original published paper via a link within the PubCast screen.
SciVee was launched in 2007 with the primary goal of creating a website where scientists and researchers could promote their work and collaborate, since according to Phil Bourne, co-founder of SciVee, "the text-only world of scholarly publishing no longer suffices in the age of the Internet video and social media." (Meredith, 2010) With its inherent "do-it-yourself" service, researchers can integrate a video commentary with their scientific publication, which will "enliven their Web presence" and "satisfy their audience's need for dynamic content." (Meredith, 2010) Furthermore, an increasing number of researchers are incorporating video into their scientific publications.
As a social awareness tool, SciVee provides the researcher with a multimedia presentation that makes scientific content more accessible, engaging and even more enjoyable, while also providing a quicker means to view the work of other scientists and to form collaborations. From a scientific standpoint, this makes it a considerably more desirable tool than the more mainstream social networking sites such as YouTube. More importantly, it provides researchers with an effective way to potentially increase the number of views of their publications and broaden their audience. (Fink & Bourne, 2007) The developers of SciVee predict that today's generation of graduate students and post-doctoral scientists will help to incite a "revolution in scientific communication", since "cyberinfrastructure" was part of their daily life while growing up, and so it is quite natural for them to perform research solely within an electronic framework. (Fink & Bourne, 2007) Given the PubCast's dynamic attributes, it seems reasonable to conclude that this form of scientific communication presents science in a way that is more accessible and engaging for the viewer than text alone. And because SciVee's target audience is primarily scientists, researchers also gain an effective medium through which to distribute their publications, increase article views, view the work of other scientists, and form collaborations.
Why use these tools
These tools allow users to acquire an overview and a greater understanding of current research fields and emerging fields. (Howard, 2011) Trending research areas can reveal themselves in social networking chatter long before publications, or even preprints, have caught up. The aware researcher might publish early on an emerging topic and thereby enjoy the first-mover advantage of a high citation count. Similar tools have been proposed in other fields, for instance, to predict the stock market (Bollen, et al., 2011) or detect emerging geo-social events. (Lee, et al., 2011) Why not science? New collaborations can be discovered, and these tools allow researchers to easily leave their research silos. Serendipity will increase, as will interdisciplinary work. Social awareness tools, while foreign and cumbersome to many, are increasingly becoming the norm for early career and future scientists. Knowledge and use of these tools will provide a competitive edge, as administrators and funding agencies are increasingly using these tools during the decision-making process. Social awareness tools will have a profound effect on scientific discovery both today and tomorrow.
Social awareness tools can enhance the research process. Tools like VIVO and Profiles can connect researchers to collaborators; ScienceSifter can connect researchers to scientific literature; and the SAT and EXPAT tools can do both. While a paper is being developed, authors can discover and manage new references in Mendeley. When a paper is completed, a site like SciVee can move its core ideas beyond the static page. Social awareness tools can benefit researchers, and that alone makes them of interest to research librarians. Librarians may be called upon to troubleshoot issues for those who are already using these tools. They might recommend and teach these tools to other patrons. Librarians themselves can employ these tools to answer reference questions, or to facilitate their own special projects. Organizations of all kinds, including libraries, must leverage widely-used technologies if they are to develop a trusted digital presence. But more than that, libraries and librarians are a crucial nexus between the information scientists who develop these tools and the research scientists who use them. In teaching these tools, librarians are well placed to discover ways to improve them. And as shepherds of the research process, they are well placed to imagine and ask for new tools that have not yet been theorized. Thus, libraries can shape not just the preservation and discovery of scientific research, but also its creation.
References
Barsky, E. (Summer 2010). Electronic Resources Reviews and Reports. Issues in Science and Technology Libraries, 62.
Bollen, J., Mao, H., & Zeng, X. (2011). Twitter mood predicts the stock market. Journal of Computational Science, 2(1), pp. 1-8. http://dx.doi.org/10.1016/j.jocs.2010.12.007
Collins, L. M., Mane, K. K., Martinez, M. L. B., Hussell, J. A. T., & Luce, R. E. (2005). ScienceSifter: Facilitating activity awareness in collaborative research groups through focused information feeds. 1st IEEE International Conference on e-Science and Grid Computing (e-Science 2005), Melbourne, Australia. pp. 40-47. http://dx.doi.org/10.1109/E-SCIENCE.2005.72
Facebook. (2011). Statistics.
Fink, J. L., & Bourne, P. E. (2007). Reinventing Scholarly Communication for the Electronic Age. CT Watch Quarterly, 3(3).
Henning, V. (2008). Mendeley - A Last.fm for Research? Fourth IEEE International Conference on eScience. http://dx.doi.org/10.1109/eScience.2008.128
Howard, J. (September 11, 2011). Citation by Citation, New Maps Chart Hot Research and Scholarship's Hidden Terrain. The Chronicle of Higher Education.
Indvik, L. (June 20, 2011). Foursquare Surpasses 10 Million Users. Mashable.
Lee, R., Wakamiya, S., & Sumiya, K. (2011). Discovery of unusual regional social activities using geo-tagged microblogs. World Wide Web, 14, pp. 321-329. http://dx.doi.org/10.1007/s11280-011-0120-x
Meredith, D. (2010). SciVee: Integrating Video into Scientific Publication. Research Explainer.
Newman, M. E. J. (2001). Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64, 016132. http://dx.doi.org/10.1103/PhysRevE.64.016132
Powell, J. E., Collins, L. M., & Martinez, M. L. B. (July/August 2010). Semantically Enhancing Collections of Library and Non-library Content. D-Lib Magazine. http://dx.doi.org/10.1045/july2010-powell
Timer, J. (2007). Science gets its YouTube on with SciVee. Ars Technica.
US National Library of Medicine. (2011). Medical Subject Headings (MeSH).
Vence, T. (November 11, 2009). The NIH's $12.2M Facebook for Researchers. Biotechniques.
Weber, G. M. (2011). Harvard Catalyst Profiles: Converting a Research Networking Product to Use Linked Open Data and the VIVO Ontology. VIVO National Conference, August 24-26, 2011, Washington, DC.
About the Authors
Tamara M. McMahon is a Clinical Informatics Coordinator at the University of Kansas Medical Center and a member of the Biostatistics Department where she extracts and maps electronic medical records and provides informatics consultative services supporting clinical and translational researchers. Prior to her current position, she served as Knowledge Integration and Emerging Technologies Librarian at The Pennsylvania State University, and on the Knowledge Systems and Human Factors team at Los Alamos National Laboratory. Her background in human factors and information science allows her to bring people, data and technology successfully together. Her interests include ontology development, medical informatics, information visualization, knowledge discovery systems and large scale data repositories. Tamara holds a Master of Information Science degree from Indiana University.
James E. Powell is a Research Technologist at the Research Library of Los Alamos National Laboratory, and a member of the Knowledge Systems and Human Factors Team where he develops digital library, semantic web, and ubiquitous computing tools to support various initiatives. He has worked in libraries off and on for over 20 years, including eight years at Virginia Tech University Libraries, where he worked on the Scholarly Communications project and participated in several collaborations between the library and the Computer Science department's digital library group. He later went on to assume the position of Director of Web Application Research and Development at Virginia Tech, and to lead the Internet Application Development group, before joining LANL.
Matthew Hopkins is a Library Professional at the Research Library of the Los Alamos National Laboratory, where he is a member of the Customer Service team. He focuses on collection management, usage metrics, and the library's web presence. He received his MLS from the University of North Carolina-Chapel Hill.
Daniel A. Alcazar is a Library Professional at the Los Alamos National Laboratory, Research Library. His work focuses on unclassified publications, interface usability testing, and usage metrics. Prior to joining the Los Alamos National Laboratory, he served as Branch Manager with the Albuquerque/Bernalillo County Library System, New Mexico. Mr. Alcazar received his MLIS from the University of Southern Mississippi, Hattiesburg and B.A. in Biology from the University of New Mexico, Albuquerque.
Laniece E. Miller is a graduate research assistant at the Los Alamos National Laboratory's Research Library. She recently finished a Master of Science in Library and Information Studies at Florida State University.
Linn Collins is a Technical Project Manager at the Los Alamos National Laboratory, where she leads the Knowledge Systems and Human Factors Team at the Research Library. Her team focuses on applying semantic web and social web technologies to challenges in national security, including situational awareness, nonproliferation, and energy security. She received a doctorate in educational technology from Columbia University in New York, where her dissertation was on semantic macrostructures and human-computer interaction. Prior to LANL she worked at IBM Research on Eduport and the Knowledge and Collaboration Machine, and at the Massachusetts Institute of Technology on Project Athena.
Ketan K. Mane is a Senior Research Informatics Developer in the Health Informatics and Bioscience Group at Renaissance Computing Institute (RENCI). His research work focuses on applying visual analytics approaches in a decision support role to help clinicians identify viable treatment options at the point of care. Dr. Mane has a background in biomedical engineering, and holds a Ph.D. in Information Science from Indiana University, Bloomington (Advisor: Dr. Katy Borner). Prior to joining RENCI, he was a member of the InfoViz lab at Indiana University directed by Dr. Katy Borner. He has also worked as a Postdoctoral Research Fellow at Los Alamos National Lab (LANL). His research interests include information visualization, visual analytics, comparative effectiveness research, decision support tools, health informatics, and knowledge domain visualization.