Researcher ID: 123456
Each researcher is modelled as an excitatory neuron
Since we have the year of each publication, we model the learning rate accordingly,
i.e. the more recent a publication is, the more it influences the strength of the connection between researchers.
(Hover the mouse over each point to see the learning rate value for each year)
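A minimal sketch of a year-dependent learning rate. The exact function is not given here, so the exponential form, the `decay` value and the function name `learning_rate` are assumptions for illustration only; the one property taken from the text is that newer publications receive a larger value.

```python
import math

def learning_rate(year, latest_year=2013, decay=0.5):
    """Hypothetical recency weight: 1.0 for the latest year, smaller for older years.

    The exponential shape and the `decay` value are assumptions; the text only
    states that more recent publications are more influential.
    """
    age = latest_year - year
    if age < 0:
        raise ValueError("year is after latest_year")
    return math.exp(-decay * age)

# One value per year in the period covered by the data.
rates = {year: learning_rate(year) for year in range(2008, 2014)}
```

Any monotonically increasing function of the year would satisfy the description equally well; the exponential is simply a convenient choice.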
staff.csv
authors.csv
outputs.csv
Basic statistics about each researcher.
University structure.
Identify how many times each pair of researchers collaborated in any particular year based on how many papers they have published…
…identify the strength of internal and external connections between academic units of particular type (e.g. schools or faculties).
Model similarities between papers (similar titles), hence identify potential collaborations.
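The per-year pair counting mentioned above could look roughly like this. It is a sketch on toy records: the first two rows mirror the sample of outputs.csv, the third is made up, and the function name `pair_counts_by_year` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy records shaped like rows of outputs.csv: (paper id, year, author ids).
# The first two rows come from the sample table; the third is made up.
papers = [
    (53490655, 2011, [1968, 12503]),
    (56453104, 2013, [27487, 22878]),
    (1, 2011, [12503, 1968, 10925]),  # hypothetical paper
]

def pair_counts_by_year(papers):
    """Count co-authored papers for every unordered researcher pair, per year."""
    counts = Counter()
    for _paper_id, year, authors in papers:
        # sorted(set(...)) makes (a, b) and (b, a) the same key and drops duplicates
        for a, b in combinations(sorted(set(authors)), 2):
            counts[(year, a, b)] += 1
    return counts

counts = pair_counts_by_year(papers)
```

Here `counts[(2011, 1968, 12503)]` is 2, because researchers 1968 and 12503 share two papers in 2011.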
id | title | type | year | keywords | abstract | authors |
---|---|---|---|---|---|---|
53490655 | communication as information use | (book, chapter) | 2011 | NaN | introduction uncertainty is an unavoidable pro... | [1968, 12503] |
56453104 | tobacco | (book, chapter) | 2013 | NaN | NaN | [27487, 22878] |
id | forename | surname | published name | job title | organisation code |
---|---|---|---|---|---|
10925 | Jamie | Jeremy | [JY Jeremy] | [(, Emeritus, , , , , , Professor, )] | [SOCS] |
24576 | Siobhan | Shilton | [SM Shilton, Siobhan M Shilton] | [(, , , , , , , Reader, ), (, , , Research, , ... | [FREN] |
KME Turner | Katy M E Turner | Katy Turner | K Turner |
Katherine Turner | Katy M. E. Turner | K.M.E. Turner | K M E Turner |
K.M. Turner | Katherine M E Turner | Katherine M. E. Turner |
y-axis: number of different names used; x-axis: number of people using y different names
There are around 1426 different job titles among 3263 researchers.
Some researchers have multiple job titles in the PURE record (pie-chart).
Queries such as "how many professors per department/faculty" are possible.
The most interesting job titles of people who published:
I reorganised the job titles into a 9-tier hierarchy (the most popular title):
Some people are also associated with more than one academic unit…
The university is structured hierarchically, so it can be modelled as a tree. Because of its size, a linear tree layout is not comprehensible, but a radial tree is.
The structure is encoded as a nested JSON file:
{ "name": "...", "short_name": "...", "full_name": "...", "url": "...", "type": "...", "people": ["...", "..."], "children": [{...}, {...}] }
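A nested structure of this shape can be traversed recursively. The sketch below uses a tiny made-up tree with placeholder names; only the keys `name`, `type`, `people` and `children` are taken from the structure above, and the helpers `walk` and `people_under` are hypothetical.

```python
import json

# A tiny made-up instance of the nested structure; all names are placeholders.
doc = json.loads("""
{"name": "University", "type": "university", "people": [],
 "children": [
   {"name": "Faculty A", "type": "faculty", "people": ["p1"],
    "children": [
      {"name": "School X", "type": "school", "people": ["p2", "p3"], "children": []}
    ]},
   {"name": "Faculty B", "type": "faculty", "people": ["p4"], "children": []}
 ]}
""")

def walk(node, depth=0):
    """Yield (depth, node) for every unit in the tree, parents before children."""
    yield depth, node
    for child in node.get("children", []):
        yield from walk(child, depth + 1)

def people_under(node):
    """Collect everyone attached to a unit or to any of its descendants."""
    return [p for _depth, unit in walk(node) for p in unit.get("people", [])]

# Group unit names by their type, e.g. all faculties or all schools.
units_by_type = {}
for _depth, unit in walk(doc):
    units_by_type.setdefault(unit["type"], []).append(unit["name"])
```

The depth returned by `walk` is what a radial layout would map to the distance from the centre.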
Since we model interactions between researchers, we consider only papers published by more than one author.
We take the year of publication into account: the more recent the publication, the more it contributes to the connection between academic units.
We present results for each year separately and for the whole period 2008 to 2013.
We model the interactions at both school and faculty level.
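One way to turn co-authorship into unit-level connection strengths is sketched below. The inputs are made up, and the linear `year_weight` is an assumption, since the actual weighting formula is not stated; the helper names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical inputs: researcher -> faculty code, and (year, author ids) papers.
unit_of = {1: "FSCI", 2: "FSCI", 3: "FMDY", 4: "FENG"}
papers = [(2013, [1, 2]), (2012, [1, 3]), (2008, [3, 4]), (2013, [2])]

def year_weight(year, first=2008, last=2013):
    """Assumed linear recency weight in (0, 1]; the real formula is not given."""
    return (year - first + 1) / (last - first + 1)

def unit_connections(papers, unit_of):
    """Sum recency-weighted co-authorships between (and within) academic units."""
    strength = defaultdict(float)
    for year, authors in papers:
        if len(authors) < 2:  # single-author papers carry no interaction
            continue
        for a, b in combinations(authors, 2):
            key = tuple(sorted((unit_of[a], unit_of[b])))
            strength[key] += year_weight(year)
    return dict(strength)

strength = unit_connections(papers, unit_of)
# Internal connections link a unit to itself; external ones link two units.
internal = {k[0]: v for k, v in strength.items() if k[0] == k[1]}
external = {k: v for k, v in strength.items() if k[0] != k[1]}
```

Swapping `unit_of` for a researcher-to-school mapping gives the school-level version of the same computation.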
We present two results for faculty interactions:

year | Internal connection: 1st | 2nd | 3rd |
---|---|---|---|
2008 | FSCI | FMDY | FMVS |
2009 | FMDY | FSCI | FSSL |
2010 | FMDY | FSCI | FSSL |
2011 | FMDY | FSCI | FSSL |
2012 | FMDY | FSCI | FMVS |
2013 | FSCI | FMDY | FSSL |
total | FMDY | FSCI | FSSL |

Total external connection | REST | FENG | FMDY | FSCI | FOAT | INST | FMVS | FSSL |
---|---|---|---|---|---|---|---|---|
1st | FENG | FENG | FMDY | FSCI | FOAT | FMDY | FMVS | FSSL |
2nd | FSCI | FOAT | FMVS | FOAT | FSCI | FSSL | FMDY | FMDY |
3rd | FSSL | FMVS | FSCI | FMDY | FENG | FMVS | FENG | FSCI |

year | Internal connection: 1st | 2nd | 3rd |
---|---|---|---|
2008 | FMVS | FSSL | FMDY |
2009 | FMDY | FSSL | FMVS |
2010 | FMDY | FSSL | FENG |
2011 | FMDY | FSSL | FMVS |
2012 | FMVS | FMDY | FENG |
2013 | FSSL | FENG | FMDY |
total | FMDY | FSSL | FMVS |

Total external connection | REST | FENG | FMDY | FSCI | FOAT | INST | FMVS | FSSL |
---|---|---|---|---|---|---|---|---|
1st | FENG | FENG | FMDY | FSCI | FOAT | FMDY | FMVS | FSSL |
2nd | REST | REST | FMVS | REST | FENG | FSSL | FMDY | REST |
3rd | FSCI | FOAT | INST | FOAT | FSCI | FMVS | FENG | FMDY |

year | Internal connection: 1st | 2nd | 3rd |
---|---|---|---|
2008 | MVSF (0.04) | CHSE (0.03) | LAWD (0.02) |
2009 | CHSE (0.06) | PSYC (0.03) | MODL (0.03) |
2010 | CHSE (0.08) | PSYC (0.03) | SPOL (0.03) |
2011 | CHSE (0.05) | MVSF (0.04) | PSYC (0.03) |
2012 | MVSF (0.08) | CHSE (0.05) | PSYC (0.03) |
2013 | CHSE (0.05) | MODL (0.04) | PSYC (0.04) |
total | CHSE (0.31) | PSYC (0.20) | MVSF (0.15) |

Total external connection | INOV | CABI | MVSF | ENGF | VESC | PSYC | CHEM | LANG | BIOC | EDUC | PHPH | LAWD | HUMS | … |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1st | MVEN | SSCM | BIOC | MVEN | CHSE | INOV | SCIF | SOCS | MVSF | ENGF | MVSF | SPOL | SART | … |
2nd | PSYC | PHPH | MSAD | EDUC | SSCM | REST | PHPH | VESC | BIOC | EFIM | SPAI | … | |||
3rd | QUEN | PANM | QUEN | BISC | BISC | MODL | PANM | MEED | SOCS | ORDS | MEED | … |
Network analysis is also possible but the network itself is difficult to visualise because of small differences in the connection strength.
The importance of each node in the network can be expressed by its centrality:
Centrality | 1st | 2nd | 3rd | 4th | … | 3rd last | 2nd last | Last |
---|---|---|---|---|---|---|---|---|
Closeness | SOCS (0.80) | SSCM (0.76) | CHEM (0.70) | QUEN (0.69) | … | MODL (0.41) | GSEN (0.41) | NSQI (0.39) |
Betweenness | SSCM (0.14) | SOCS (0.13) | CHEM (0.12) | QUEN (0.09) | … | LANG (0.00) | CABI (0.00) | INOV (0.00) |
Eigenvector | ORDS (0.68) | MDYF (0.59) | CHSE (0.40) | SSCM (0.08) | … | GSEN (0.00) | LANG (0.00) | MODL (0.00) |
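To illustrate what one of these measures captures, here is a sketch of closeness centrality on a toy network. The edges are made up (only the school codes come from the table), and the unweighted breadth-first search is an assumption; we do not claim this is how the values above were computed.

```python
from collections import deque

# Toy undirected school network; the edges are made up for illustration.
edges = [("SOCS", "SSCM"), ("SOCS", "CHEM"), ("SOCS", "QUEN"), ("QUEN", "MODL")]

graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def closeness(graph, source):
    """Closeness centrality: (n - 1) / sum of shortest-path lengths from `source`.

    Uses unweighted breadth-first search and assumes a connected graph.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return (len(graph) - 1) / sum(dist.values())

centrality = {node: closeness(graph, node) for node in graph}
ranking = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy graph the hub `SOCS` ranks first and the leaf `MODL` last, which is the kind of ordering the table summarises.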
Classical clustering gives meaningless results on this data.
Hierarchical clustering, on the other hand, produces somewhat meaningful results
(the data are thresholded by the school connection strength).
Depending on where you cut this tree structure vertically, you obtain different clusterings.
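The effect of the cut can be sketched with a small single-linkage implementation. The linkage method, the dissimilarity values and the function name `single_linkage` are all assumptions for illustration; only the school codes are taken from the tables.

```python
def single_linkage(distance, items, cut):
    """Agglomerative single-linkage clustering, stopped at distance `cut`.

    `distance` maps frozenset({a, b}) to a dissimilarity. Merging stops once the
    closest pair of clusters is farther apart than `cut`, which corresponds to
    cutting the tree at that distance.
    """
    clusters = [{item} for item in items]

    def gap(c1, c2):
        # Single linkage: distance between the closest members of two clusters.
        return min(distance[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        pairs = [(gap(c1, c2), i, j)
                 for i, c1 in enumerate(clusters)
                 for j, c2 in enumerate(clusters) if i < j]
        d, i, j = min(pairs)
        if d > cut:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Made-up dissimilarities (e.g. 1 - connection strength) between four schools.
schools = ["CHSE", "PSYC", "MODL", "LANG"]
distance = {
    frozenset(("CHSE", "PSYC")): 0.2, frozenset(("MODL", "LANG")): 0.3,
    frozenset(("CHSE", "MODL")): 0.8, frozenset(("CHSE", "LANG")): 0.9,
    frozenset(("PSYC", "MODL")): 0.7, frozenset(("PSYC", "LANG")): 0.9,
}

fine = single_linkage(distance, schools, cut=0.25)   # three clusters
coarse = single_linkage(distance, schools, cut=0.5)  # two clusters
```

A lower cut keeps `LANG` and `MODL` apart, while a higher cut merges them, so the same tree yields different clusterings.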
To discover possible collaborations between schools we model similarities between publication titles.
We do this with tf-idf and cosine distance, so we can find similar papers and obtain a similarity score for each pair.
We threshold the scores (at 0.25) to keep only relevant papers and then build connections between academic units based on the authors' unit associations.
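The title-similarity step can be sketched in plain Python as follows. The tf-idf variant (raw counts, unsmoothed idf), the whitespace tokenisation and the reading of 0.25 as a minimum similarity are assumptions; the second and third titles are made up.

```python
import math
from collections import Counter

def tfidf_vectors(titles):
    """Return one sparse tf-idf vector (dict: term -> weight) per title.

    Uses raw term counts and idf = log(N / df); libraries often add smoothing,
    so absolute scores will differ from theirs.
    """
    docs = [Counter(title.lower().split()) for title in titles]
    df = Counter(term for doc in docs for term in doc)
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in doc.items()} for doc in docs]

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# The first title appears in the sample table; the other two are made up.
titles = [
    "communication as information use",
    "information use in communication",
    "tobacco control policy",
]
vectors = tfidf_vectors(titles)
THRESHOLD = 0.25  # the cut-off quoted in the text, read here as a minimum similarity

similar = []
for i in range(len(titles)):
    for j in range(i + 1, len(titles)):
        score = cosine(vectors[i], vectors[j])
        if score >= THRESHOLD:
            similar.append((i, j, score))
```

Each surviving pair of papers can then be mapped to the academic units of its authors to suggest a potential connection between those units.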