Nowcasting & Placecasting of Patent Quality around the Globe
– A Temporal Semantic Similarity Approach to Patent Impact Prediction –
October 23, 2018
Daniel S. Hain † Roman Jurowetzki † Tobias Buchmann ψ Patrick Wolf ψ
[email protected] [email protected] [email protected] [email protected]
† IKE, SDS, Department of Business and Management, Aalborg University, DK
ψ Center for Solar Energy and Hydrogen Research Baden-Württemberg (ZSW)
1
Background & Motivation
What we do, how, and why
“Ideally, we might hope that patent statistics would provide a measure of the output of inventive activity, a direct reading on the rate at which the potential production possibilities frontier is shifted outward. The reality, however, is very far from it.” — [Griliches, 1990, p. 1669]
Patents & innovation performance: What's done so far...
- Widely accessible measure of inventive/innovative activity and performance [Griliches, 1990].
- Yet, the technological as well as economic significance of patents varies broadly [Basberg, 1987].
- Existing approaches to create indicators of patent quality are mainly centered around:
  - Number/composition of IPC assignments [Lerner, 1994]
  - Backward citations [Shane, 2001; Trajtenberg et al., 1997]
  - Forward citations [Trajtenberg et al., 1997]
  - Lately, first attempts to introduce text (keyword) based indicators [Arts et al., 2017]
What we do instead...
Derive a quality measure at the patent level capturing both its ex-ante novelty and its ex-post technological significance.
1. Exploit the rich textual information contained in patent abstracts with semantic embedding techniques to create high-dimensional vectors capturing their technological features.
2. Relational mapping of similarity structures between patents (network analysis).
3. Temporal mapping of technological similarity between patents (lead-lag analysis).
4. Prediction of ex-post quality indicators with deep-learning-powered forecasting techniques (nowcasting).
5. Provision of interactive visualization with high granularity (placecasting).
2
Methods: Pipeline Overview
Our Approach in a Nutshell
3
Methods: 1 - Patent-to-Vector
Or: How to take text and come up with numbers.
Sentence embedding problem
- Sentence and phrase representation is a “dynamic” research area.
- Evolved from simple bag-of-words → topic models → word-embedding-based neural net approaches (in only a few years).
- We use a TF-IDF-weighted average of word embedding representations.
- Result: a 300-dimensional “meaning vector”
[Figure: TF-IDF-weighted embedding. Each token of “electrical_connector characterised by a receptacle containing a plurality of female_contacts having redundant_contact portions and wiping_capabilities with respect to male_pins” is mapped to a word vector (vec1, vec2, vec3, ...), and the vectors are summed with TF-IDF weights (e.g., 0.6, 0.1, 0.3).]
# Intuition check (assumes spaCy with a vector model, e.g. en_core_web_md;
# exact values depend on the model used):
import spacy
nlp = spacy.load("en_core_web_md")

target = nlp("Cats are beautiful animals.")
test = nlp("Felines are gorgeous creatures.")
test2 = nlp("Dolphins are swimming mammals.")

print(target.similarity(test))   # Out: 0.9522696443950177
print(target.similarity(test2))  # Out: 0.7822956256736615
Semantic Vector – First Quality Check
- Use a classification benchmark setup: how well does the latent vector predict a known discrete feature? → We use a simple feed-forward neural net to predict the IPC3 classification and achieve around 83% multi-class prediction accuracy (out-of-sample).
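The benchmark above can be sketched with synthetic vectors and labels standing in for our patent data (the class count, sample size, and network size are illustrative assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_classes = 8                                   # stands in for the IPC3 classes

# Synthetic 300-d "meaning vectors": one Gaussian cluster per class.
centers = rng.normal(size=(n_classes, 300))
y = rng.integers(0, n_classes, size=2000)
X = centers[y] + rng.normal(scale=0.5, size=(2000, 300))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Simple feed-forward net as an out-of-sample sanity check of the vectors.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)                # out-of-sample accuracy
```

High accuracy here only shows the benchmark mechanics; on real patent vectors the analogous setup yields the ~83% figure reported above.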
4
Methods: 2 - Vector-to-Similarity
Or: How to create a “really huge” similarity matrix
Similarity calculation issues
- Next task: create relational (dyadic) similarity/distance measures of technological distance.
- Traditional approach: compute a Euclidean-distance or cosine-similarity (optimally sparse) matrix.
- However, this scales very badly, since the matrix size grows quadratically (n_matrix = n_obs²).
- For small case studies borderline feasible; for all available patents (ca. 20m) we would need an enormous amount of computing power [e.g., the "admittedly very brute-force" approach by Younge and Kuhn, 2016].
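The quadratic blow-up can be seen directly; the sizes below are illustrative:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

n, d = 2_000, 300
vectors = np.random.default_rng(0).normal(size=(n, d)).astype(np.float32)

S = cosine_similarity(vectors)      # dense (n, n) similarity matrix
memory_gb = S.nbytes / 1e9          # ~16 MB at float32 for just 2,000 patents;
                                    # 20m patents would need ~1.6 petabytes
```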
...Maybe we need something slightly more practical...
5
Methods: 2 - Vector-to-Similarity
Or: How to create a “really huge” similarity matrix
A slightly more pragmatic approach
- Leveraging Spotify's Annoy (Approximate Nearest Neighbors Oh Yeah, https://github.com/spotify/annoy)
- C++ library with Python bindings to search for points in space that are close to a given query point. Creates large read-only file-based data structures that are mmapped into memory.
- Result: a near-linearly scalable workflow for similarity mapping of large datasets
6
Methods: 3 - Similarity-to-Quality
Temporal similarity: Intuition
- Semantic similarity is independent of time.
- The temporal distribution of similarity can be exploited.
- Inspired by the "lead-lag" approach of Ramage et al. [2010]; Shi et al. [2010].
Temporal similarity: Types

Similarity to past
- Exploitation of existing knowledge.
- High values might indicate backward orientation; low values indicate novelty.

Similarity to present
- “Riding the wave”: indicates activity in a trending area.

Similarity to future
- Shaping the agenda; indicator of future impact.
- Also an indicator of a “window of opportunity”: a high-growth technological field or area of application.
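One way to operationalize the past/present/future split; the one-year "present" window, the mean aggregation, and the toy neighbour list are our illustrative assumptions:

```python
import numpy as np

def temporal_similarity(focal_year, neighbours, window=1):
    """Split a patent's neighbour similarities by time relative to the patent.

    neighbours: (grant_year, similarity) pairs for the focal patent's nearest
    semantic neighbours. `window` (in years) defines the "present" band.
    """
    past = [s for y, s in neighbours if y < focal_year - window]
    present = [s for y, s in neighbours if abs(y - focal_year) <= window]
    future = [s for y, s in neighbours if y > focal_year + window]
    mean = lambda xs: float(np.mean(xs)) if xs else 0.0
    return mean(past), mean(present), mean(future)

sim_past, sim_present, sim_future = temporal_similarity(
    2005, [(1998, 0.41), (2004, 0.55), (2005, 0.62), (2012, 0.71)]
)
```

A patent whose `sim_future` dominates resembles later work more than earlier work, which is the intuition behind using it as a forward-impact signal.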
7
Illustration: A little Case Study
Electro-mobility
- Electromobility-related patents selected via an expert-advised IPC class selection.
- Further, all patents cited by the “seed” set are also included (ca. 13k).
IPC class    Level  Description
B60L 11/00   0      Electric propulsion
B60L 11/02   1      using engine-driven generators
B60L 11/04   2      using dc generators and motors
B60L 11/06   2      using ac generators and dc motors
B60L 11/08   2      using ac generators and motors
B60L 11/10   2      using dc generators and ac motors
B60L 11/12   2      with additional electric power supply
B60L 11/14   2      with provision for direct propulsion
B60L 11/16   1      using power stored mechanically
B60L 11/18   1      using power from primary cells
Statistic     N       Mean   St. Dev.  Min     Max
∆t            76,990  1.095  0.855     -2.591  4.556
t_past        76,990  0.209  0.100     0.000   0.813
t_present     76,990  0.292  0.092     0.000   0.917
t_future      76,990  0.499  0.135     0.043   1.000
sim           76,990  0.557  0.067     0.124   0.781
sim_past      76,990  0.116  0.059     0.000   0.476
sim_present   76,990  0.165  0.058     0.000   0.600
sim_future    76,990  0.276  0.081     0.026   0.656
[Figure: Distribution of similarity – density curves of sim, sim.past, sim.present, and sim.future over the range 0.0–0.8.]
8
Illustration: A little Case Study
Electro-mobility
Figure: Similarity to Past vs. Future: Country Level
[Scatter plot: countries (CN, FR, IL, TW, JP, IT, DE, NL, CH, CA, KR, SE, FI, GB, US, AT) plotted by similarity to future (x, ca. 0.25–0.28) against similarity to past (y, ca. 0.08–0.12); point size encodes the number of patents, colour the “internal” measure.]
9
Illustration: A little Case Study
Electro-mobility
Figure: Similarity to Past vs. Future: Development over Time
[Faceted line plot: similarity to future (y, 0.0–0.4) over the years 1980–2010, one panel per country (CA, CN, DE, FR, GB, JP, KR, US).]
10
Illustration: A little Case Study
Electro-mobility
- Obviously, different levels of aggregation deliver different insights.
Figure: Firm Level
[Scatter plot: firms (FUJITSU TEN LTD, TOYOTA MOTOR CORP, TOYOTA JIDOSHA KK, NISSAN MOTOR CO LTD, DAIHATSU MOTOR CO LTD) plotted by similarity to future (x) against similarity to past (y); point size encodes N patents, colour the “internal” measure.]
Figure: Technology Level
[Scatter plot: IPC classes (H02J, B60L, B60K, H01M, C01B) plotted by similarity to future (x) against similarity to past (y); point size encodes N patents, colour the “internal” measure.]
11
Illustration: A little Case Study
Electro-mobility
Figure: Knowledge flow network between countries
12
Methods: 4 - Nowcasting: Rare Event Prediction
Neural approaches to anomaly detection
- Forecasting of similarities etc. is fairly easy with deep learning or traditional ML.
- However, more interesting: rare events, like breakthrough patents.

Anomaly detection to spot the top 1%
- Task: identifying breakthrough patents [Ahuja and Lampert, 2001]
- Autoencoder: a self-supervised model that aims to reproduce its inputs.
- Train on “boring normality”.
- High reproduction error when facing anomalous inputs → “something is wrong”.
- Results so far: very nice ROC AUC (> 0.8), high accuracy (0.87) and sensitivity (0.81) out of sample.
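A toy version of this idea, using a bottlenecked `MLPRegressor` trained to reproduce its input as a stand-in for our autoencoder (the synthetic data and architecture are illustrative assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# "Boring normality": data living on a low-dimensional structure.
Z = rng.normal(size=(500, 5))
W = rng.normal(size=(5, 20))
normal = Z @ W

# Rare, structurally different inputs (shifted off the normal manifold).
anomalies = rng.normal(loc=3.0, size=(20, 20))

# A bottlenecked regressor trained to reproduce its input acts as an autoencoder.
ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
ae.fit(normal, normal)

def reconstruction_error(X):
    """Mean squared reconstruction error per sample."""
    return ((ae.predict(X) - X) ** 2).mean(axis=1)

# Anomalies reconstruct much worse than the training distribution.
err_normal = reconstruction_error(normal).mean()
err_anomaly = reconstruction_error(anomalies).mean()
```

Thresholding the reconstruction error then flags the rare, "anomalous" inputs, which is the mechanism we exploit to spot breakthrough candidates.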
13
Methods: 5 - Placecasting: The Global Patent Explorer
The power of visualization and data-narratives
- So far so good, but after all we just produce numbers.
- However, complex data pipelines are of little use without producing a narrative.
- Therefore, we went a step further and provide interactive visualizations of geolocations, granular geographical networks, etc. [and, as a goodie, many traditional patent measures by Squicciarini et al., 2013]
- Furthermore, all code and results will soon be available on GitHub, for reproducibility, scrutiny, and the progress of research.
www.gpxp.org
14
Conclusion & Outlook
What we did, what's on the list, and what we're up to.

Helpful for policy?
- Technology life-cycle studies [e.g., Gao et al., 2013; Lee et al., 2016] to inform investment decisions, policies, and theory.
- Technology forecasting to enrich existing approaches [e.g., Altuntas et al., 2015; Kim and Bae, 2017].
- More dynamic and forward-looking specialization and development decisions.
- Better fund allocation.

Helpful for industry/professionals?
- Identify trending and “neighbouring” technologies.
- Investment decisions, VC, company valuation.

What next?
- Ongoing: evaluation of indicators, correlation to traditional indicators (works well).
- Ongoing: large-scale expert assessment of technological distance, novelty, etc.
- Better geocoding, particularly in Asia.
- Vectorization: Seq2Seq deep autoencoder & LSTM latent layer. Difficult to train; currently does not outperform the simpler TF-IDF-average approach.
- Forecasting: a really nice LSTM for continuous updates of predictions.
15
The End
Fin.
16
References I
To whom it may concern
Ahuja, G. and Lampert, C. (2001). Entrepreneurship in the large corporation: A longitudinal study of how established firms create breakthrough inventions. Strategic Management Journal, 22(6-7):521–543.
Altuntas, S., Dereli, T., and Kusiak, A. (2015). Forecasting technology success based on patent data. Technological Forecasting and Social Change, 96:202–214.
Basberg, B. L. (1987). Patents and the measurement of technological change: A survey of the literature. Research Policy, 16(2-4):131–141.
Gao, L., Porter, A. L., Wang, J., Fang, S., Zhang, X., Ma, T., Wang, W., and Huang, L. (2013). Technology life cycle analysis method based on patent documents. Technological Forecasting and Social Change, 80(3):398–407.
Griliches, Z. (1990). Patent statistics as economic indicators: A survey. Journal of Economic Literature, 28(4):1661–1707.
Kim, G. and Bae, J. (2017). A novel approach to forecast promising technology through patent analysis. Technological Forecasting and Social Change, 117:228–237.
Lee, C., Kim, J., Kwon, O., and Woo, H.-G. (2016). Stochastic technology life cycle analysis using multiple patent indicators. Technological Forecasting and Social Change, 106:53–64.
Lerner, J. (1994). The importance of patent scope: An empirical analysis. The RAND Journal of Economics, pages 319–333.
Ramage, D., Manning, C. D., and McFarland, D. A. (2010). Which universities lead and lag? Toward university rankings based on scholarly output. In Proc. of the NIPS Workshop on Computational Social Science and the Wisdom of the Crowds.
Shane, S. (2001). Technological opportunities and new firm creation. Management Science, 47(2):205–220.
Shi, X., Nallapati, R., Leskovec, J., McFarland, D., and Jurafsky, D. (2010). Who leads whom: Topical lead-lag analysis across corpora. In NIPS Workshop.
Squicciarini, M., Dernis, H., and Criscuolo, C. (2013). Measuring patent quality: Indicators of technological and economic value.
Trajtenberg, M., Henderson, R., and Jaffe, A. (1997). University versus corporate patents: A window on the basicness of invention. Economics of Innovation and New Technology, 5(1):19–50.
Younge, K. and Kuhn, J. (2016). Patent-to-patent similarity: A vector space model.