
NSF EAGER WEB OF INNOVATION

Sustainable Energy

Many of today's largest, most innovative, and most profitable firms were once small or medium-sized. Think of Apple, Google, and Amazon. But the rate of change in our economy means that being large does not guarantee longevity: the time it takes for a company to appear on and then drop off the S&P 500 may be just 14 years by 2026, as discussed in this AEI blog. In 1966, this was 33 years, and in 1990 it was 20. This shrinking time horizon means that firms must do more to stay relevant to their customers. One way for firms to increase their staying power is to invest continually in R&D so that innovation becomes a source of competitive advantage rather than a threat.

Small firm research and development (R&D) inputs and outputs can be hard to measure. Traditionally, scholars use patent, publication, and government data, along with data from primary collection (e.g., surveys or interviews). Yet not all small firms patent, and even fewer publish journal articles. Business databases, on the other hand, may provide decent coverage, but they are expensive. And across many disciplines, researchers have faced declining survey response rates for many years, among individuals and organizations alike. (See this National Academies Workshop Proceedings (2017) for more information on the current state of innovation indicators.)

 

This NSF EAGER Web of Innovation project turns to a different set of data, namely websites and search engines, to identify the limitations and opportunities associated with 'big data' sources for the study of small firm innovation. The project continues my past work on the online behaviors of individuals, institutions, and firms in a broader online ecosystem that captures some of what is being done in science, technology, and innovation today.

 

Of course, not everyone and not every organization is online. But failing to study these data carefully alongside other, better-studied data sources would be a missed opportunity for advancing research: we are on a journey to understand what these new data tell us about phenomena we already know quite a lot about, e.g., why certain industries may innovate more than others. At the same time, online platforms, whether social media or firm websites, might change or supplement what we think we know, e.g., the expected types of communication between actors, how firms market their capabilities, and how these capabilities change over time. In some cases, new uses of information and communication technologies may change the innovation process itself.

To get to these interesting questions, we need good, reliable data. Big data sources are not usually intended for research purposes, so improving data quality and producing reliable, meaningful measures are two important tasks. The first part of this research, mostly completed as of April 2019, defines a method for identifying firm URLs and employment data to generate a sample frame. We are also able to collect firm website data and clean it for further processing. One benefit of this approach is that it is fairly low-cost and transparent: the code and data are available for free online at GitHub. Additionally, there is a companion workshop (with a getting-started guide) that walks students and other interested researchers through the method in six hands-on labs. See the resources page for more details.
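
As a rough illustration of what "cleaning" firm website data can involve, the sketch below strips markup, scripts, and styles from a page and normalizes whitespace. It is not the project's published code: the names `_TextExtractor` and `clean_page` are hypothetical, and the choice of which tags to skip is an assumption made for this example.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style/noscript content (assumed noise)."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def clean_page(html):
    """Return the page's visible text with runs of whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return re.sub(r"\s+", " ", text).strip()
```

For example, `clean_page("<p>We build  solar\n panels.</p><script>var x=1;</script>")` yields the plain sentence without the script body. Real firm pages usually need further steps (navigation menus, cookie banners, duplicated footers) that this sketch does not attempt.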

 

The second part of this research, currently in progress, uses website data to explore the narratives of entrepreneurial small firms. Narratives can be thought of as pre-packaged storylines that convey key facts and symbolic representations about who the firm is, what it offers, and how it produces and interacts with customers: stories package “factual information about [a firm’s] stock of tangible and intangible capital into a simpler, more coherent and meaningful whole” (Martens et al., 2007). The analytical approach isolates narrative development on small firm websites by examining the topical change that naturally occurs on select ‘about us’ pages. We operationalize our dependent variable as a series of changes in topics over time, measured at the level of distinct paragraphs.
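
To make the idea of "a series of changes in topics over time, measured in paragraphs" more concrete, here is a minimal sketch. It swaps in a deliberately simple stand-in for topic modeling, namely term overlap (Jaccard distance) between paragraphs, so it should not be read as the project's actual measure. The function names, the tiny stopword list, and the rule of matching each paragraph to its closest predecessor are all assumptions made for illustration.

```python
import re

# Assumed, deliberately tiny stopword list for the example.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "for", "we", "our", "is", "are", "with", "on"}
)


def terms(paragraph):
    """Lowercased alphabetic tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", paragraph.lower()) if w not in STOPWORDS}


def split_paragraphs(page_text):
    """Split page text on blank lines into non-empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", page_text) if p.strip()]


def distance(a, b):
    """Jaccard distance between the term sets of two paragraphs (0 = same terms)."""
    ta, tb = terms(a), terms(b)
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)


def change_series(snapshots):
    """For each consecutive pair of page snapshots, average over current
    paragraphs the distance to the closest paragraph in the previous snapshot."""
    series = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        prev_pars = split_paragraphs(prev)
        curr_pars = split_paragraphs(curr)
        if not prev_pars or not curr_pars:
            # Treat a page appearing or disappearing as maximal change.
            series.append(1.0 if prev_pars or curr_pars else 0.0)
            continue
        per_par = [min(distance(c, p) for p in prev_pars) for c in curr_pars]
        series.append(sum(per_par) / len(per_par))
    return series
```

Given a chronological list of ‘about us’ page texts, `change_series` returns one value per consecutive pair of snapshots: 0.0 when the paragraphs are unchanged, and values closer to 1.0 as the vocabulary shifts. A proper topic model would replace `distance` in a fuller analysis.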

Stay tuned for published research outputs. The original proposal abstract is available here.  
