Complete result tables: Labeled attachment score (LAS), Unlabeled attachment score (UAS), Label accuracy
Significance: for LAS, for UAS, for LAS when scoring punctuation
All official submissions of participants in one tarball
Output of eval.pl for concatenation of all submissions
Older, partial result tables: Average and standard deviation, Top three scores
Note: The "Name" and "Affiliation" columns contain the information as provided by participants during results upload.
Labeled attachment score (LAS)
| Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | AV | SD | Bulgarian | Name | Affiliation | ||
| 57.64 | 78.37 | 60.92 | 77.90 | 74.59 | 77.56 | 87.41 | 77.42 | 59.19 | 68.32 | 79.15 | 51.07 | 70.80 | 11.11 | 78.74 | Sander Canisius, Antal van den Bosch, Erik Tjong Kim Sang, Toine Bogers, Jeroen Geertzen | Tilburg University | ||
| 53.81 | 54.89 | 59.76 | 66.35 | 58.24 | 69.77 | 65.38 | 75.36 | 57.19 | 67.44 | 68.77 | 37.80 | 61.23 | 9.92 | 72.89 | Giuseppe Attardi | Universita di Pisa | ||
| 63.81 | 74.81 | 59.36 | 78.38 | 68.45 | 76.52 | 90.11 | 81.47 | 67.83 | 72.99 | 71.72 | 55.09 | 71.71 | 9.67 | 79.73 | YuChieh Wu | National Central University | ||
| 60.94 | 83.68 | 68.82 | 79.74 | 67.25 | 82.41 | 88.13 | 83.37 | 68.43 | 77.16 | 78.65 | 58.06 | 74.72 | 9.72 | 83.30 | Xavier Carreras, Mihai Surdeanu, Lluís Màrquez | Technical University of Catalonia | ||
| 52.42 | 72.72 | 51.86 | 71.56 | 62.75 | 63.82 | 84.35 | 70.35 | 55.06 | 69.63 | 65.23 | 60.31 | 65.01 | 9.46 | 73.49 | Deniz Yuret | Koc University | ||
| 55.37 | 76.18 | 63.02 | 74.61 | 69.51 | 74.74 | 84.75 | 78.18 | 64.31 | 71.37 | 74.09 | 53.87 | 70.00 | 9.25 | 79.21 | Eckhard Bick | University of Southern Denmark | ||
| 66.71 | 86.92 | 78.42 | 84.77 | 78.59 | 85.82 | 91.65 | 87.60 | 70.30 | 81.29 | 84.58 | 65.68 | 80.19 | 8.53 | 87.41 | Joakim Nivre, Johan Hall, Jens Nilsson, Gülşen Eryiğit, Svetoslav Marinov | Växjö University, Istanbul Technical University, University of Skövde | ||
| 44.39 | 66.20 | 53.34 | 76.05 | 72.11 | 68.73 | 83.35 | 71.01 | 50.72 | 46.96 | 71.10 | 49.81 | 62.81 | 13.01 | 0.00 | Michael Schiehlen | IMS, Uni Stuttgart | ||
| 50.74 | 75.29 | 58.52 | 77.70 | 59.36 | 68.11 | 70.84 | 71.13 | 57.21 | 65.08 | 63.83 | 41.72 | 63.29 | 10.42 | 67.64 | Jinshan Ma | ? | ||
| 53.37 | 71.63 | 60.54 | 66.61 | 61.56 | 70.97 | 82.87 | 75.28 | 58.73 | 67.62 | 67.58 | 46.05 | 65.23 | 9.93 | 74.81 | Markus Dreyer, David A. Smith, and Noah A. Smith | Johns Hopkins University | ||
| 66.71 | 86.70 | 76.60 | 82.83 | 77.51 | 85.36 | 90.57 | 84.69 | 71.08 | 79.82 | 81.78 | 57.52 | 78.43 | 9.38 | 85.24 | John O'Neil | Basis Technology, Inc. | ||
| 60.92 | 85.05 | 72.88 | 80.60 | 72.91 | 84.17 | 89.07 | 83.99 | 69.52 | 79.72 | 82.31 | 60.51 | 76.80 | 9.43 | 0.00 | Quang Xuan Do, Ming-Wei Chang | University of Illinois at Urbana-Champaign | ||
| 64.29 | 72.49 | 71.46 | 81.54 | 72.67 | 80.43 | 85.63 | 84.57 | 66.43 | 78.16 | 78.13 | 63.39 | 74.93 | 7.65 | 0.00 | Richard Johansson and Pierre Nugues | Department of Computer Science, Lund University, Sweden | ||
| 66.91 | 85.90 | 80.18 | 84.79 | 79.19 | 87.34 | 90.71 | 86.82 | 73.44 | 82.25 | 82.55 | 63.19 | 80.27 | 8.43 | 87.57 | Ryan McDonald, Kevin Lerman, Fernando Pereira | University of Pennsylvania | ||
| 66.65 | 89.96 | 67.44 | 83.63 | 78.59 | 86.24 | 90.51 | 84.43 | 71.20 | 77.38 | 80.66 | 58.61 | 77.94 | 10.05 | 0.00 | Sebastian Riedel, Ivan Meza-Ruiz, Ruken Çakıcı | University of Edinburgh | ||
| 62.71 | 84.73 | 75.24 | 81.56 | 76.61 | 84.92 | 90.37 | 86.01 | 69.06 | 77.68 | 82.00 | 63.21 | 77.84 | 8.95 | 0.00 | Kenji Sagae | Carnegie Mellon University | ||
| 62.83 | 0.00 | 0.00 | 75.81 | 0.00 | 0.00 | 0.00 | 0.00 | 64.57 | 73.17 | 79.49 | 54.23 | 34.18 | 36.26 | 0.00 | Nobuyuki Shimizu | SUNY/Albany | ||
| 63.53 | 79.92 | 74.48 | 81.74 | 71.43 | 83.47 | 89.95 | 84.59 | 72.42 | 80.36 | 79.69 | 61.74 | 76.94 | 8.47 | 83.36 | Simon Corston-Oliver and Anthony Aue | Microsoft Research | ||
| 65.19 | 84.27 | 76.24 | 81.72 | 71.77 | 84.11 | 89.91 | 85.07 | 71.42 | 80.46 | 81.08 | 61.22 | 77.70 | 8.67 | 86.34 | Yuchang Cheng | Nara Institute of Science and Technology | ||
| 59.94 | 78.32 | 67.17 | 78.31 | 70.73 | 78.58 | 85.86 | 80.63 | 65.16 | 73.52 | 76.44 | 55.95 | | | 79.98 | AV (column average) | | ||
| 6.53 | 8.82 | 8.93 | 5.45 | 6.66 | 7.51 | 7.09 | 5.83 | 6.78 | 8.41 | 6.46 | 7.71 | | | 6.30 | SD (column standard deviation) | | ||
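The AV and SD figures in the table can be checked with a few lines of Python. The sketch below is only an illustration, not the script used to build the page: the function names are made up here. It assumes the per-row SD is the sample standard deviation over the twelve main languages, and that the AV row skips entries of 0.00. Both assumptions are inferred from the published numbers rather than stated on the page.

```python
from statistics import mean, stdev
from typing import List, Tuple

def row_summary(scores: List[float]) -> Tuple[float, float]:
    """Average and sample standard deviation of one participant's scores."""
    return round(mean(scores), 2), round(stdev(scores), 2)

def column_average(scores: List[float]) -> float:
    """Average of one language column, skipping 0.00 entries (assumption)."""
    submitted = [s for s in scores if s != 0.0]
    return round(mean(submitted), 2)

# LAS row of McDonald, Lerman, Pereira (twelve main languages, table order).
mcdonald_las = [66.91, 85.90, 80.18, 84.79, 79.19, 87.34,
                90.71, 86.82, 73.44, 82.25, 82.55, 63.19]

# Bulgarian LAS column in table order; 0.00 entries are kept as listed.
bulgarian_las = [78.74, 72.89, 79.73, 83.30, 73.49, 79.21, 87.41, 0.00,
                 67.64, 74.81, 85.24, 0.00, 0.00, 87.57, 0.00, 0.00,
                 0.00, 83.36, 86.34]

av, sd = row_summary(mcdonald_las)
bulgarian_av = column_average(bulgarian_las)
```

With these inputs the row summary should come out close to the 80.27 and 8.43 listed for that row, and the Bulgarian column average close to the 79.98 in the AV row.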
Unlabeled attachment score (UAS)
| Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | AV | SD | Bulgarian | Name | Affiliation | ||
| 74.59 | 82.86 | 72.88 | 82.93 | 77.79 | 80.01 | 89.67 | 85.61 | 74.02 | 71.33 | 85.08 | 64.19 | 78.41 | 7.28 | 82.51 | Sander Canisius, Antal van den Bosch, Erik Tjong Kim Sang, Toine Bogers, Jeroen Geertzen | Tilburg University | ||
| 69.50 | 81.33 | 73.44 | 78.84 | 68.93 | 80.25 | 82.05 | 85.03 | 72.14 | 74.25 | 83.03 | 65.25 | 76.17 | 6.42 | 85.24 | Giuseppe Attardi | Universita di Pisa | ||
| 75.45 | 79.48 | 74.82 | 83.39 | 71.75 | 79.73 | 91.74 | 85.57 | 76.92 | 76.20 | 76.24 | 69.25 | 78.38 | 6.17 | 85.50 | YuChieh Wu | National Central University | ||
| 72.65 | 88.65 | 77.44 | 85.67 | 71.39 | 85.90 | 90.79 | 87.76 | 77.72 | 80.77 | 85.54 | 70.05 | 81.19 | 7.21 | 88.81 | Xavier Carreras, Mihai Surdeanu, Lluís Màrquez | Technical University of Catalonia | ||
| 68.82 | 78.37 | 66.36 | 78.16 | 66.17 | 67.71 | 87.31 | 79.46 | 70.60 | 73.89 | 73.25 | 71.54 | 73.47 | 6.36 | 78.56 | Deniz Yuret | Koc University | ||
| 68.98 | 83.06 | 72.24 | 80.54 | 74.47 | 79.79 | 87.85 | 84.29 | 75.06 | 75.76 | 82.65 | 65.50 | 77.52 | 6.66 | 84.16 | Eckhard Bick | University of Southern Denmark | ||
| 77.52 | 90.54 | 84.80 | 89.80 | 81.35 | 88.76 | 93.10 | 91.22 | 78.72 | 84.67 | 89.50 | 75.82 | 85.48 | 5.90 | 91.72 | Joakim Nivre, Johan Hall, Jens Nilsson, Gülşen Eryiğit, Svetoslav Marinov | Växjö University, Istanbul Technical University, University of Skövde | ||
| 62.63 | 74.87 | 66.86 | 81.94 | 75.59 | 72.64 | 86.71 | 81.27 | 68.45 | 53.18 | 79.69 | 61.58 | 72.12 | 9.87 | 0.00 | Michael Schiehlen | IMS, Uni Stuttgart | ||
| 64.79 | 79.90 | 68.14 | 79.90 | 64.07 | 73.00 | 72.64 | 77.10 | 68.94 | 70.07 | 73.19 | 56.90 | 70.72 | 6.77 | 73.97 | Jinshan Ma | ? | ||
| 68.46 | 77.63 | 70.74 | 77.45 | 68.33 | 76.98 | 85.97 | 82.41 | 72.88 | 72.85 | 79.53 | 60.45 | 74.47 | 6.98 | 81.95 | Markus Dreyer, David A. Smith, and Noah A. Smith | Johns Hopkins University | ||
| 78.54 | 90.64 | 85.58 | 88.78 | 81.73 | 89.16 | 93.16 | 89.70 | 81.71 | 84.11 | 88.45 | 72.02 | 85.30 | 6.00 | 90.72 | John O'Neil | Basis Technology, Inc. | ||
| 76.09 | 89.60 | 81.78 | 86.85 | 76.25 | 86.90 | 90.77 | 88.60 | 80.32 | 83.09 | 89.05 | 73.15 | 83.54 | 6.01 | 0.00 | Quang Xuan Do, Ming-Wei Chang | University of Illinois at Urbana-Champaign | ||
| 75.53 | 77.04 | 77.40 | 86.59 | 76.01 | 83.09 | 87.11 | 88.40 | 74.36 | 81.43 | 84.17 | 73.59 | 80.39 | 5.36 | 0.00 | Richard Johansson and Pierre Nugues | Department of Computer Science, Lund University, Sweden | ||
| 79.34 | 91.07 | 87.30 | 90.58 | 83.57 | 90.38 | 92.84 | 91.36 | 83.17 | 86.05 | 88.93 | 74.67 | 86.61 | 5.51 | 92.04 | Ryan McDonald, Kevin Lerman, Fernando Pereira | University of Pennsylvania | ||
| 78.62 | 93.18 | 77.32 | 89.66 | 82.91 | 89.76 | 92.96 | 89.42 | 83.17 | 81.05 | 88.33 | 74.07 | 85.04 | 6.38 | 0.00 | Sebastian Riedel, Ivan Meza-Ruiz, Ruken Çakıcı | University of Edinburgh | ||
| 74.11 | 89.64 | 82.64 | 86.53 | 80.71 | 87.92 | 92.20 | 89.78 | 78.02 | 81.13 | 88.57 | 73.31 | 83.71 | 6.35 | 0.00 | Kenji Sagae | Carnegie Mellon University | ||
| 74.27 | 0.00 | 0.00 | 81.72 | 0.00 | 0.00 | 0.00 | 0.00 | 74.88 | 77.58 | 86.62 | 68.77 | 38.65 | 40.59 | 0.00 | Nobuyuki Shimizu | SUNY/Albany | ||
| 78.40 | 90.00 | 83.02 | 87.94 | 74.83 | 87.20 | 92.84 | 88.96 | 81.77 | 84.87 | 89.54 | 73.11 | 84.37 | 6.28 | 90.09 | Simon Corston-Oliver and Anthony Aue | Microsoft Research | ||
| 77.74 | 89.46 | 83.40 | 88.64 | 75.49 | 87.66 | 93.12 | 90.30 | 81.14 | 85.15 | 88.57 | 74.49 | 84.60 | 6.15 | 91.30 | Yuchang Cheng | Nara Institute of Science and Technology | ||
| 73.48 | 84.85 | 77.01 | 84.52 | 75.07 | 82.60 | 89.05 | 86.46 | 76.53 | 77.76 | 84.21 | 69.35 | | | 85.89 | AV (column average) | | ||
| 4.94 | 5.99 | 6.70 | 4.29 | 5.78 | 6.73 | 5.20 | 4.17 | 4.67 | 7.81 | 5.45 | 5.51 | | | 5.60 | SD (column standard deviation) | | ||
Label accuracy
| Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | AV | SD | Bulgarian | Name | Affiliation | ||
| 70.38 | 80.85 | 70.46 | 84.09 | 80.07 | 86.78 | 90.69 | 81.85 | 69.44 | 81.95 | 82.45 | 64.41 | 78.62 | 8.01 | 83.28 | Sander Canisius, Antal van den Bosch, Erik Tjong Kim Sang, Toine Bogers, Jeroen Geertzen | Tilburg University | ||
| 72.97 | 58.75 | 69.84 | 74.65 | 66.47 | 77.68 | 73.68 | 80.79 | 69.36 | 82.19 | 72.42 | 49.81 | 70.72 | 9.12 | 77.68 | Giuseppe Attardi | Universita di Pisa | ||
| 77.66 | 78.91 | 65.90 | 84.81 | 73.49 | 82.97 | 93.58 | 87.16 | 78.22 | 85.17 | 73.91 | 67.22 | 79.08 | 8.18 | 83.58 | YuChieh Wu | National Central University | ||
| 78.36 | 86.12 | 78.74 | 85.57 | 77.31 | 89.42 | 92.04 | 88.74 | 81.10 | 88.72 | 82.83 | 73.41 | 83.53 | 5.79 | 87.49 | Xavier Carreras, Mihai Surdeanu, Lluís Màrquez | Technical University of Catalonia | ||
| 63.61 | 75.41 | 59.36 | 77.41 | 67.39 | 69.91 | 87.31 | 73.49 | 60.49 | 77.38 | 67.38 | 71.32 | 70.87 | 7.98 | 76.58 | Deniz Yuret | Koc University | ||
| 73.75 | 80.91 | 76.48 | 82.48 | 77.71 | 84.56 | 88.77 | 83.65 | 78.32 | 85.85 | 79.47 | 71.26 | 80.27 | 5.11 | 85.38 | Eckhard Bick | University of Southern Denmark | ||
| 80.34 | 89.01 | 85.40 | 89.16 | 83.69 | 91.03 | 94.34 | 91.54 | 80.54 | 90.06 | 87.39 | 78.49 | 86.75 | 5.05 | 90.44 | Joakim Nivre, Johan Hall, Jens Nilsson, Gülşen Eryiğit, Svetoslav Marinov | Växjö University, Istanbul Technical University, University of Skövde | ||
| 63.51 | 71.45 | 71.64 | 84.57 | 82.69 | 83.73 | 88.93 | 77.14 | 65.53 | 74.25 | 76.16 | 69.07 | 75.72 | 7.99 | 0.00 | Michael Schiehlen | IMS, Uni Stuttgart | ||
| 68.50 | 79.80 | 69.34 | 84.41 | 69.61 | 79.27 | 80.25 | 76.46 | 70.52 | 81.69 | 68.93 | 54.25 | 73.59 | 8.37 | 75.46 | Jinshan Ma | ? | ||
| 69.98 | 79.48 | 71.54 | 75.25 | 69.13 | 82.51 | 87.09 | 82.41 | 72.18 | 82.71 | 71.40 | 58.97 | 75.22 | 7.90 | 81.79 | Markus Dreyer, David A. Smith, and Noah A. Smith | Johns Hopkins University | ||
| 80.00 | 88.93 | 83.44 | 87.74 | 82.83 | 90.58 | 93.52 | 88.70 | 80.42 | 89.04 | 85.14 | 70.21 | 85.05 | 6.24 | 88.71 | John O'Neil | Basis Technology, Inc. | ||
| 75.69 | 87.28 | 80.42 | 86.51 | 80.15 | 91.03 | 92.18 | 88.84 | 79.26 | 89.26 | 84.82 | 73.75 | 84.10 | 6.10 | 0.00 | Quang Xuan Do, Ming-Wei Chang | University of Illinois at Urbana-Champaign | ||
| 79.06 | 77.10 | 82.14 | 87.17 | 81.15 | 89.10 | 89.51 | 89.42 | 80.70 | 88.42 | 83.21 | 77.63 | 83.72 | 4.77 | 0.00 | Richard Johansson and Pierre Nugues | Department of Computer Science, Lund University, Sweden | ||
| 79.50 | 88.23 | 86.72 | 89.22 | 83.89 | 92.11 | 93.74 | 90.46 | 82.51 | 90.40 | 85.58 | 77.45 | 86.65 | 5.04 | 90.70 | Ryan McDonald, Kevin Lerman, Fernando Pereira | University of Pennsylvania | ||
| 80.18 | 91.93 | 77.70 | 88.22 | 83.51 | 91.15 | 93.46 | 88.54 | 80.42 | 88.08 | 84.25 | 70.80 | 84.85 | 6.68 | 0.00 | Sebastian Riedel, Ivan Meza-Ruiz, Ruken Çakıcı | University of Edinburgh | ||
| 79.18 | 87.16 | 83.80 | 88.08 | 81.75 | 90.97 | 93.56 | 90.22 | 80.96 | 88.98 | 85.12 | 77.71 | 85.62 | 5.02 | 0.00 | Kenji Sagae | Carnegie Mellon University | ||
| 78.74 | 0.00 | 0.00 | 83.15 | 0.00 | 0.00 | 0.00 | 0.00 | 76.98 | 86.05 | 83.13 | 67.50 | 39.63 | 41.63 | 0.00 | Nobuyuki Shimizu | SUNY/Albany | ||
| 76.81 | 82.21 | 82.18 | 86.89 | 79.53 | 89.18 | 93.20 | 88.88 | 81.91 | 89.46 | 82.33 | 74.95 | 83.96 | 5.57 | 86.57 | Simon Corston-Oliver and Anthony Aue | Microsoft Research | ||
| 79.02 | 86.42 | 83.52 | 86.11 | 75.83 | 90.67 | 92.40 | 88.00 | 80.96 | 88.90 | 83.99 | 73.91 | 84.14 | 5.78 | 89.27 | Yuchang Cheng | Nara Institute of Science and Technology | ||
| 75.12 | 81.66 | 76.59 | 84.50 | 77.57 | 86.26 | 89.90 | 85.35 | 76.31 | 85.71 | 80.00 | 69.59 | | | 84.38 | AV (column average) | | ||
| 5.49 | 7.92 | 7.69 | 4.35 | 5.92 | 6.01 | 5.36 | 5.45 | 6.40 | 4.56 | 6.24 | 7.94 | | | 5.23 | SD (column standard deviation) | | ||
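The three result tables report labeled attachment score, unlabeled attachment score, and label accuracy. As a reminder of how the three relate, the sketch below computes them under their usual definitions: the percentage of tokens with the correct head and label, with the correct head, and with the correct label. It is not eval.pl. The function name and the (head, label) token representation are assumptions made for illustration, and eval.pl's handling of punctuation is not modeled.

```python
from typing import List, Tuple

Token = Tuple[int, str]  # (index of the head token, dependency label)

def attachment_scores(gold: List[Token],
                      pred: List[Token]) -> Tuple[float, float, float]:
    """Return (LAS, UAS, label accuracy) as percentages over all tokens."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    las = sum(g == p for g, p in zip(gold, pred))        # head and label match
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head matches
    la = sum(g[1] == p[1] for g, p in zip(gold, pred))   # label matches
    return 100.0 * las / n, 100.0 * uas / n, 100.0 * la / n

# Toy four-token sentence: token 3 has a wrong label, token 4 a wrong head.
gold = [(2, "SBJ"), (0, "ROOT"), (2, "OBJ"), (3, "NMOD")]
pred = [(2, "SBJ"), (0, "ROOT"), (2, "NMOD"), (2, "NMOD")]
las, uas, la = attachment_scores(gold, pred)
```

In the toy sentence two tokens are fully correct, three have the correct head, and three have the correct label, so LAS is lower than both UAS and label accuracy, as it is throughout the tables.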
Significance was computed with version 1.8 of eval.pl and Dan Bikel's Randomized Parsing Evaluation Comparator (Statistical Significance Tester for evalb Output), using its default of 10,000 iterations. Differences are taken to be significant if p < 0.05.
The comparisons for LAS when scoring punctuation use the -p option of eval.pl.
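The p values in the comparisons are consistent with p = (num + 1) / (iterations + 1) for 10,000 iterations: num = 3495 gives 3496/10001, about 0.3496, and num = 0 gives 1/10001, about 0.0001. The sketch below shows that formula together with a generic paired randomization test (stratified shuffling) of the kind such comparators implement. It is not Bikel's code. The function names, the two-sided test statistic, and the per-sentence (correct, scored) input format are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Counts = Tuple[int, int]  # (correct tokens, scored tokens) for one sentence

def p_value(num: int, iterations: int = 10_000) -> float:
    """Smoothed p-value: (num + 1) / (iterations + 1)."""
    return (num + 1) / (iterations + 1)

def score(counts: List[Counts]) -> float:
    """Percentage of correct tokens over all scored tokens."""
    correct = sum(c for c, _ in counts)
    total = sum(t for _, t in counts)
    return 100.0 * correct / total

def randomization_test(sys_a: List[Counts], sys_b: List[Counts],
                       iterations: int = 10_000,
                       seed: int = 0) -> Tuple[int, float]:
    """Swap the two systems' results per sentence with probability 0.5 and
    count how often the shuffled difference is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(score(sys_a) - score(sys_b))
    num = 0
    for _ in range(iterations):
        shuf_a, shuf_b = [], []
        for a, b in zip(sys_a, sys_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuf_a.append(a)
            shuf_b.append(b)
        if abs(score(shuf_a) - score(shuf_b)) >= observed:
            num += 1
    return num, p_value(num, iterations)

# Toy data: system A gets 9 of 10 tokens right per sentence, system B only 5.
toy_a = [(9, 10)] * 50
toy_b = [(5, 10)] * 50
num, p = randomization_test(toy_a, toy_b, iterations=1000)
```

On the toy data the difference is large and consistent across sentences, so p falls well below the 0.05 threshold. Comparing a system against itself makes every shuffle count, which gives p = 1.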
Significance for LAS
Lang12 80.27 80.19 78.43 77.94 77.84 77.70 76.94 76.80 74.94 74.72 71.71 70.79 70.00 65.23 65.00 63.29 62.82 61.23 34.20
1) McDonald 80.27 2) Nivre 80.19
Not significant (p = 0.34956504349565 ; diff = -0.0766040816325955 ; num = 3495)
1) McDonald 80.27 3) O'Neil 78.43
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -1.84425855893407 ; num = 0)
1) McDonald 80.27 4) Riedel 77.94
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.33402299042056 ; num = 0)
1) McDonald 80.27 5) Sagae 77.84
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.43062973760922 ; num = 0)
2) Nivre 80.19 3) O'Neil 78.43
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -1.76765447730148 ; num = 0)
3) O'Neil 78.43 4) Riedel 77.94
SIGNIFICANT (p = 0.0001999800019998 ; diff = -0.489764431486492 ; num = 1)
4) Riedel 77.94 5) Sagae 77.84
Not significant (p = 0.320567943205679 ; diff = -0.0966067471886589 ; num = 3205)
Arabic 66.91 66.71 66.71 66.65 65.19 64.29 63.81 63.53 62.83 62.71 60.94 60.92 57.64 55.37 53.81 53.37 52.42 50.74 44.39
1) McDonald 66.91 2) O'Neil 66.71
Not significant (p = 0.386261373862614 ; diff = -0.200438877755573 ; num = 3862)
1) McDonald 66.91 3) Nivre 66.71
Not significant (p = 0.401859814018598 ; diff = -0.200579158316685 ; num = 4018)
1) McDonald 66.91 4) Riedel 66.65
Not significant (p = 0.339466053394661 ; diff = -0.260196392785588 ; num = 3394)
1) McDonald 66.91 5) Cheng 65.19
SIGNIFICANT (p = 0.0075992400759924 ; diff = -1.72336472945896 ; num = 75)
2) O'Neil 66.71 3) Nivre 66.71
Not significant (p = 0.495850414958504 ; diff = -0.000140280561112149 ; num = 4958)
3) Nivre 66.71 4) Riedel 66.65
Not significant (p = 0.473252674732527 ; diff = -0.0596172344689023 ; num = 4732)
4) Riedel 66.65 5) Cheng 65.19
SIGNIFICANT (p = 0.0150984901509849 ; diff = -1.46316833667338 ; num = 150)
Bulgarian 87.57 87.41 86.34 85.24 83.36 83.30 79.73 79.21 78.74 74.81 73.49 72.89 67.64 0.00 0.00 0.00 0.00 0.00 0.00
1) McDonald 87.57 2) Nivre 87.41
Not significant (p = 0.396860313968603 ; diff = -0.159477358866937 ; num = 3968)
1) McDonald 87.57 3) Cheng 86.34
SIGNIFICANT (p = 0.0073992600739926 ; diff = -1.23680031917007 ; num = 73)
1) McDonald 87.57 4) O'Neil 85.24
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.33374027528419 ; num = 0)
1) McDonald 87.57 5) Corston-Oliver 83.36
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -4.20888489926183 ; num = 0)
2) Nivre 87.41 3) Cheng 86.34
SIGNIFICANT (p = 0.018998100189981 ; diff = -1.07732296030314 ; num = 189)
3) Cheng 86.34 4) O'Neil 85.24
SIGNIFICANT (p = 0.0118988101189881 ; diff = -1.09693995611411 ; num = 118)
4) O'Neil 85.24 5) Corston-Oliver 83.36
SIGNIFICANT (p = 0.0004999500049995 ; diff = -1.87514462397765 ; num = 4)
Chinese 89.96 86.92 86.70 85.90 85.05 84.73 84.27 83.68 79.92 78.37 76.18 75.29 74.81 72.72 72.49 71.63 66.20 54.89 0.00
1) Riedel 89.96 2) Nivre 86.92
SIGNIFICANT (p = 0.0001999800019998 ; diff = -3.03812072434619 ; num = 1)
1) Riedel 89.96 3) O'Neil 86.70
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -3.25949094567422 ; num = 0)
1) Riedel 89.96 4) McDonald 85.90
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -4.06434004024162 ; num = 0)
1) Riedel 89.96 5) Do/Chang 85.05
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -4.90931790744472 ; num = 0)
2) Nivre 86.92 3) O'Neil 86.70
Not significant (p = 0.358764123587641 ; diff = -0.22137022132803 ; num = 3587)
3) O'Neil 86.70 4) McDonald 85.90
SIGNIFICANT (p = 0.0444955504449555 ; diff = -0.804849094567402 ; num = 444)
4) McDonald 85.90 5) Do/Chang 85.05
Not significant (p = 0.107389261073893 ; diff = -0.844977867203099 ; num = 1073)
Czech 80.18 78.42 76.60 76.24 75.24 74.48 72.88 71.46 68.82 67.44 63.02 60.92 60.54 59.76 59.36 58.52 53.34 51.86 0.00
1) McDonald 80.18 2) Nivre 78.42
SIGNIFICANT (p = 0.0098990100989901 ; diff = -1.76019000000009 ; num = 98)
1) McDonald 80.18 3) O'Neil 76.60
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -3.58013599999998 ; num = 0)
1) McDonald 80.18 4) Cheng 76.24
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -3.94030400000008 ; num = 0)
1) McDonald 80.18 5) Sagae 75.24
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -4.94000200000001 ; num = 0)
2) Nivre 78.42 3) O'Neil 76.60
SIGNIFICANT (p = 0.00999900009999 ; diff = -1.81994599999989 ; num = 99)
3) O'Neil 76.60 4) Cheng 76.24
Not significant (p = 0.318268173182682 ; diff = -0.360168000000101 ; num = 3182)
4) Cheng 76.24 5) Sagae 75.24
Not significant (p = 0.126887311268873 ; diff = -0.999697999999924 ; num = 1268)
Danish 84.79 84.77 83.63 82.83 81.74 81.72 81.56 81.54 80.60 79.74 78.38 77.90 77.70 76.05 75.81 74.61 71.56 66.61 66.35
1) McDonald 84.79 2) Nivre 84.77
Not significant (p = 0.497650234976502 ; diff = -0.0196606786427225 ; num = 4976)
1) McDonald 84.79 3) Riedel 83.63
SIGNIFICANT (p = 0.0047995200479952 ; diff = -1.15756087824353 ; num = 47)
1) McDonald 84.79 4) O'Neil 82.83
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -1.95614171656689 ; num = 0)
1) McDonald 84.79 5) Corston-Oliver 81.74
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -3.05375049900198 ; num = 0)
2) Nivre 84.77 3) Riedel 83.63
SIGNIFICANT (p = 0.0271972802719728 ; diff = -1.13790019960081 ; num = 271)
3) Riedel 83.63 4) O'Neil 82.83
SIGNIFICANT (p = 0.0097990200979902 ; diff = -0.79858083832336 ; num = 97)
4) O'Neil 82.83 5) Corston-Oliver 81.74
SIGNIFICANT (p = 0.012998700129987 ; diff = -1.09760878243509 ; num = 129)
Dutch 79.19 78.59 78.59 77.51 76.61 74.59 72.91 72.67 72.11 71.77 71.43 69.51 68.45 67.25 62.75 61.56 59.36 58.24 0.00
1) McDonald 79.19 2) Nivre 78.59
Not significant (p = 0.204979502049795 ; diff = -0.600554221688768 ; num = 2049)
1) McDonald 79.19 3) Riedel 78.59
Not significant (p = 0.131986801319868 ; diff = -0.600422168867652 ; num = 1319)
1) McDonald 79.19 4) O'Neil 77.51
SIGNIFICANT (p = 0.0014998500149985 ; diff = -1.68097839135656 ; num = 14)
1) McDonald 79.19 5) Sagae 76.61
SIGNIFICANT (p = 0.0001999800019998 ; diff = -2.58084633853545 ; num = 1)
2) Nivre 78.59 3) Riedel 78.59
Not significant (p = 0.495750424957504 ; diff = 0.00013205282111528 ; num = 4957)
3) Riedel 78.59 4) O'Neil 77.51
SIGNIFICANT (p = 0.0063993600639936 ; diff = -1.08055622248891 ; num = 63)
4) O'Neil 77.51 5) Sagae 76.61
Not significant (p = 0.113688631136886 ; diff = -0.89986794717889 ; num = 1136)
German 87.34 86.24 85.82 85.36 84.92 84.17 84.11 83.47 82.41 80.43 77.56 76.52 74.74 70.97 69.77 68.73 68.11 63.82 0.00
1) McDonald 87.34 2) Riedel 86.24
SIGNIFICANT (p = 0.0094990500949905 ; diff = -1.09809904153353 ; num = 94)
1) McDonald 87.34 3) Nivre 85.82
SIGNIFICANT (p = 0.0053994600539946 ; diff = -1.51769968051114 ; num = 53)
1) McDonald 87.34 4) O'Neil 85.36
SIGNIFICANT (p = 0.0001999800019998 ; diff = -1.97680910543129 ; num = 1)
1) McDonald 87.34 5) Sagae 84.92
SIGNIFICANT (p = 0.0003999600039996 ; diff = -2.41611821086259 ; num = 3)
2) Riedel 86.24 3) Nivre 85.82
Not significant (p = 0.244675532446755 ; diff = -0.419600638977613 ; num = 2446)
3) Nivre 85.82 4) O'Neil 85.36
Not significant (p = 0.240975902409759 ; diff = -0.459109424920143 ; num = 2409)
4) O'Neil 85.36 5) Sagae 84.92
Not significant (p = 0.257374262573743 ; diff = -0.439309105431306 ; num = 2573)
Japanese 91.65 90.71 90.57 90.51 90.37 90.11 89.95 89.91 89.07 88.13 87.41 85.63 84.75 84.35 83.35 82.87 70.84 65.38 0.00
1) Nivre 91.65 2) McDonald 90.71
SIGNIFICANT (p = 0.0054994500549945 ; diff = -0.939738157105793 ; num = 54)
1) Nivre 91.65 3) O'Neil 90.57
SIGNIFICANT (p = 0.0003999600039996 ; diff = -1.07970217869271 ; num = 3)
1) Nivre 91.65 4) Riedel 90.51
SIGNIFICANT (p = 0.0010998900109989 ; diff = -1.13957025784526 ; num = 10)
1) Nivre 91.65 5) Sagae 90.37
SIGNIFICANT (p = 0.0001999800019998 ; diff = -1.27942034779134 ; num = 1)
2) McDonald 90.71 3) O'Neil 90.57
Not significant (p = 0.311968803119688 ; diff = -0.13996402158692 ; num = 3119)
3) O'Neil 90.57 4) Riedel 90.51
Not significant (p = 0.406759324067593 ; diff = -0.059868079152551 ; num = 4067)
4) Riedel 90.51 5) Sagae 90.37
Not significant (p = 0.359264073592641 ; diff = -0.13985008994608 ; num = 3592)
Portuguese 87.60 86.82 86.01 85.07 84.69 84.59 84.57 84.43 83.99 83.37 81.47 78.18 77.42 75.36 75.28 71.13 71.01 70.35 0.00
1) Nivre 87.60 2) McDonald 86.82
Not significant (p = 0.0871912808719128 ; diff = -0.778947893791141 ; num = 871)
1) Nivre 87.60 3) Sagae 86.01
SIGNIFICANT (p = 0.0004999500049995 ; diff = -1.59745258534642 ; num = 4)
1) Nivre 87.60 4) Cheng 85.07
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.53578758235173 ; num = 0)
1) Nivre 87.60 5) O'Neil 84.69
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.91475144739466 ; num = 0)
2) McDonald 86.82 3) Sagae 86.01
Not significant (p = 0.0687931206879312 ; diff = -0.818504691555276 ; num = 687)
3) Sagae 86.01 4) Cheng 85.07
SIGNIFICANT (p = 0.0465953404659534 ; diff = -0.938334997005313 ; num = 465)
4) Cheng 85.07 5) O'Neil 84.69
Not significant (p = 0.245475452454755 ; diff = -0.378963865042934 ; num = 2454)
Slovene 73.44 72.42 71.42 71.20 71.08 70.30 69.52 69.06 68.43 67.83 66.43 64.57 64.31 59.19 58.73 57.21 57.19 55.06 50.72
1) McDonald 73.44 2) Corston-Oliver 72.42
Not significant (p = 0.0628937106289371 ; diff = -1.01914468425265 ; num = 628)
1) McDonald 73.44 3) Cheng 71.42
SIGNIFICANT (p = 0.0033996600339966 ; diff = -2.01796962430059 ; num = 33)
1) McDonald 73.44 4) Riedel 71.20
SIGNIFICANT (p = 0.0001999800019998 ; diff = -2.23829536370904 ; num = 1)
1) McDonald 73.44 5) O'Neil 71.08
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.35802158273381 ; num = 0)
2) Corston-Oliver 72.42 3) Cheng 71.42
Not significant (p = 0.0761923807619238 ; diff = -0.998824940047939 ; num = 761)
3) Cheng 71.42 4) Riedel 71.20
Not significant (p = 0.365663433656634 ; diff = -0.220325739408452 ; num = 3656)
4) Riedel 71.20 5) O'Neil 71.08
Not significant (p = 0.406159384061594 ; diff = -0.11972621902477 ; num = 4061)
Spanish 82.25 81.29 80.46 80.36 79.82 79.72 78.16 77.68 77.38 77.16 73.17 72.99 71.37 69.63 68.32 67.62 67.44 65.08 46.96
1) McDonald 82.25 2) Nivre 81.29
Not significant (p = 0.107989201079892 ; diff = -0.961574834702475 ; num = 1079)
1) McDonald 82.25 3) Cheng 80.46
SIGNIFICANT (p = 0.0120987901209879 ; diff = -1.78294730514931 ; num = 120)
1) McDonald 82.25 4) Corston-Oliver 80.36
SIGNIFICANT (p = 0.0002999700029997 ; diff = -1.88341614906835 ; num = 2)
1) McDonald 82.25 5) O'Neil 79.82
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.42434381887402 ; num = 0)
2) Nivre 81.29 3) Cheng 80.46
Not significant (p = 0.142285771422858 ; diff = -0.821372470446832 ; num = 1422)
3) Cheng 80.46 4) Corston-Oliver 80.36
Not significant (p = 0.433556644335566 ; diff = -0.100468843919046 ; num = 4335)
4) Corston-Oliver 80.36 5) O'Neil 79.82
Not significant (p = 0.141585841415858 ; diff = -0.540927669805669 ; num = 1415)
Swedish 84.58 82.55 82.31 82.00 81.78 81.08 80.66 79.69 79.49 79.15 78.65 78.13 74.09 71.72 71.10 68.77 67.58 65.23 63.83
1) Nivre 84.58 2) McDonald 82.55
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.03136825333601 ; num = 0)
1) Nivre 84.58 3) Do/Chang 82.31
SIGNIFICANT (p = 0.0001999800019998 ; diff = -2.27051782513445 ; num = 1)
1) Nivre 84.58 4) Sagae 82.00
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.58940848436569 ; num = 0)
1) Nivre 84.58 5) O'Neil 81.78
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.8082772356105 ; num = 0)
2) McDonald 82.55 3) Do/Chang 82.31
Not significant (p = 0.300969903009699 ; diff = -0.239149571798436 ; num = 3009)
3) Do/Chang 82.31 4) Sagae 82.00
Not significant (p = 0.267673232676732 ; diff = -0.318890659231243 ; num = 2676)
4) Sagae 82.00 5) O'Neil 81.78
Not significant (p = 0.347065293470653 ; diff = -0.218868751244813 ; num = 3470)
Turkish 65.68 63.39 63.21 63.19 61.74 61.22 60.51 60.31 58.61 58.06 57.52 55.09 54.23 53.87 51.07 49.81 46.05 41.72 37.80
1) Nivre 65.68 2) Johansson 63.39
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.29034455287789 ; num = 0)
1) Nivre 65.68 3) Sagae 63.21
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.46960963951408 ; num = 0)
1) Nivre 65.68 4) McDonald 63.19
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -2.48964150567618 ; num = 0)
1) Nivre 65.68 5) Corston-Oliver 61.74
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -3.94360087631949 ; num = 0)
2) Johansson 63.39 3) Sagae 63.21
Not significant (p = 0.385361463853615 ; diff = -0.179265086636192 ; num = 3853)
3) Sagae 63.21 4) McDonald 63.19
Not significant (p = 0.467853214678532 ; diff = -0.0200318661620997 ; num = 4678)
4) McDonald 63.19 5) Corston-Oliver 61.74
SIGNIFICANT (p = 0.00999900009999 ; diff = -1.4539593706433 ; num = 99)
Significance for UAS
Lang12 86.60 85.48 85.30 85.03 84.59 84.37 83.71 83.54 81.19 80.40 78.41 78.38 77.51 76.17 74.47 73.47 72.12 70.72 38.69
1) McDonald 86.60 2) Nivre 85.48
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -0.951225420645528 ; num = 0)
2) Nivre 85.48 3) O'Neil 85.30
SIGNIFICANT (p = 0.0070992900709929 ; diff = -0.36026162867816 ; num = 70)
Arabic 79.34 78.62 78.54 78.40 77.74 77.52 76.09 75.53 75.45 74.59 74.27 74.11 72.65 69.50 68.98 68.82 68.46 64.79 62.63
1) McDonald 79.34 2) Riedel 78.62
Not significant (p = 0.0705929407059294 ; diff = -0.674653609909583 ; num = 705)
2) Riedel 78.62 3) O'Neil 78.54
Not significant (p = 0.322267773222678 ; diff = -0.154388122779622 ; num = 3222)
Bulgarian 92.04 91.72 91.30 90.72 90.09 88.81 85.50 85.24 84.16 82.51 81.95 78.56 73.97 0.00 0.00 0.00 0.00 0.00 0.00
1) McDonald 92.04 2) Nivre 91.72
Not significant (p = 0.297770222977702 ; diff = -0.234827800413541 ; num = 2977)
2) Nivre 91.72 3) Cheng 91.30
Not significant (p = 0.177182281771823 ; diff = -0.36526162101579 ; num = 1771)
Chinese 93.18 91.07 90.64 90.54 90.00 89.64 89.60 89.46 88.65 83.06 82.86 81.33 79.90 79.48 78.37 77.63 77.04 74.87 0.00
1) Riedel 93.18 2) McDonald 91.07
SIGNIFICANT (p = 0.0005999400059994 ; diff = -1.21794846924494 ; num = 5)
2) McDonald 91.07 3) O'Neil 90.64
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -1.40016338788998 ; num = 0)
Czech 87.30 85.58 84.80 83.40 83.02 82.64 81.78 77.44 77.40 77.32 74.82 73.44 72.88 72.24 70.74 68.14 66.86 66.36 0.00
1) McDonald 87.30 2) O'Neil 85.58
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -1.66345250666407 ; num = 0)
2) O'Neil 85.58 3) Nivre 84.80
Not significant (p = 0.301869813018698 ; diff = -0.291146555036377 ; num = 3018)
Danish 90.58 89.80 89.66 88.78 88.64 87.94 86.85 86.59 86.53 85.67 83.39 82.93 81.94 81.72 80.54 79.90 78.84 78.16 77.45
1) McDonald 90.58 2) Nivre 89.80
Not significant (p = 0.0771922807719228 ; diff = -0.589205302003393 ; num = 771)
2) Nivre 89.80 3) Riedel 89.66
Not significant (p = 0.313468653134687 ; diff = -0.219810263934804 ; num = 3134)
Dutch 83.57 82.91 81.73 81.35 80.71 77.79 76.25 76.01 75.59 75.49 74.83 74.47 71.75 71.39 68.93 68.33 66.17 64.07 0.00
1) McDonald 83.57 2) Riedel 82.91
Not significant (p = 0.0602939706029397 ; diff = -0.676567763708292 ; num = 602)
2) Riedel 82.91 3) O'Neil 81.73
SIGNIFICANT (p = 0.0006999300069993 ; diff = -1.30306230659862 ; num = 6)
German 90.38 89.76 89.16 88.76 87.92 87.66 87.20 86.90 85.90 83.09 80.25 80.01 79.79 79.73 76.98 73.00 72.64 67.71 0.00
1) McDonald 90.38 2) Riedel 89.76
SIGNIFICANT (p = 0.042995700429957 ; diff = -0.634872872209115 ; num = 429)
2) Riedel 89.76 3) O'Neil 89.16
SIGNIFICANT (p = 0.0024997500249975 ; diff = -0.554039100452172 ; num = 24)
Japanese 93.16 93.12 93.10 92.96 92.84 92.84 92.20 91.74 90.79 90.77 89.67 87.85 87.31 87.11 86.71 85.97 82.05 72.64 0.00
1) O'Neil 93.16 2) Cheng 93.12
Not significant (p = 0.447655234476552 ; diff = 0.0444399511914781 ; num = 4476)
2) Cheng 93.12 3) Nivre 93.10
Not significant (p = 0.490650934906509 ; diff = -0.00754089712938821 ; num = 4906)
Portuguese 91.36 91.22 90.30 89.78 89.70 89.42 88.96 88.60 88.40 87.76 85.61 85.57 85.03 84.29 82.41 81.27 79.46 77.10 0.00
1) McDonald 91.36 2) Nivre 91.22
Not significant (p = 0.501849815018498 ; diff = 0.00421041635529207 ; num = 5018)
2) Nivre 91.22 3) Cheng 90.30
SIGNIFICANT (p = 0.0196980301969803 ; diff = -0.91647187546333 ; num = 196)
Slovene 83.17 83.17 81.77 81.71 81.14 80.32 78.72 78.02 77.72 76.92 75.06 74.88 74.36 74.02 72.88 72.14 70.60 68.94 68.45
1) McDonald 83.17 2) Riedel 83.17
Not significant (p = 0.132386761323868 ; diff = -0.471202811969547 ; num = 1323)
2) Riedel 83.17 3) Corston-Oliver 81.77
SIGNIFICANT (p = 0.0258974102589741 ; diff = -0.905599548709873 ; num = 258)
Spanish 86.05 85.15 84.87 84.67 84.11 83.09 81.43 81.13 81.05 80.77 77.58 76.20 75.76 74.25 73.89 72.85 71.33 70.07 53.18
1) McDonald 86.05 2) Cheng 85.15
Not significant (p = 0.0724927507249275 ; diff = -1.00142453875124 ; num = 724)
2) Cheng 85.15 3) Corston-Oliver 84.87
Not significant (p = 0.31006899310069 ; diff = -0.337954497765551 ; num = 3100)
Swedish 89.54 89.50 89.05 88.93 88.57 88.57 88.45 88.33 86.62 85.54 85.08 84.17 83.03 82.65 79.69 79.53 76.24 73.25 73.19
1) Corston-Oliver 89.54 2) Nivre 89.50
Not significant (p = 0.413258674132587 ; diff = 0.107872146044144 ; num = 4132)
2) Nivre 89.50 3) Do/Chang 89.05
Not significant (p = 0.221577842215778 ; diff = -0.343394880410116 ; num = 2215)
Turkish 75.82 74.67 74.49 74.07 73.59 73.31 73.15 73.11 72.02 71.54 70.05 69.25 68.77 65.50 65.25 64.19 61.58 60.45 56.90
1) Nivre 75.82 2) McDonald 74.67
Not significant (p = 0.188081191880812 ; diff = -0.467515567314209 ; num = 1880)
2) McDonald 74.67 3) Cheng 74.49
Not significant (p = 0.102389761023898 ; diff = -0.670985862743649 ; num = 1023)
Significance for LAS when scoring punctuation
Lang12 80.56 80.23 78.66 78.27 78.22 78.16 77.57 76.33 75.23 75.08 71.05 69.73 69.13 63.55 63.50 62.43 61.69 58.39 36.00
1) McDonald 80.56 2) Nivre 80.23
SIGNIFICANT (p = 0.0460953904609539 ; diff = -0.326017028789465 ; num = 460)
2) Nivre 80.23 3) O'Neil 78.66
SIGNIFICANT (p = 9.99900009999e-05 ; diff = -1.57469131759578 ; num = 0)
Arabic 67.00 66.98 66.95 66.69 65.40 64.82 64.40 63.67 63.15 63.09 61.42 59.07 56.23 53.42 52.91 51.98 50.79 50.47 43.40
1) O'Neil 67.00 2) Nivre 66.98
Not significant (p = 0.4995500449955 ; diff = -0.0183044853899474 ; num = 4995)
2) Nivre 66.98 3) McDonald 66.95
Not significant (p = 0.486051394860514 ; diff = -0.0371207891308387 ; num = 4860)
Bulgarian 88.07 88.05 86.97 85.51 83.99 83.60 78.90 76.88 76.53 74.67 71.17 69.68 64.93 0.00 0.00 0.00 0.00 0.00 0.00
1) McDonald 88.07 2) Nivre 88.05
Not significant (p = 0.478152184781522 ; diff = -0.0169396697000792 ; num = 4781)
2) Nivre 88.05 3) Cheng 86.97
SIGNIFICANT (p = 0.0242975702429757 ; diff = -1.0784378159757 ; num = 242)
Chinese 89.96 86.95 86.65 85.85 85.08 84.74 84.18 83.60 79.93 78.35 76.20 75.26 74.72 72.55 72.53 71.69 66.10 55.05 0.00
1) Riedel 89.96 2) Nivre 86.95
SIGNIFICANT (p = 0.0001999800019998 ; diff = -3.01265562649654 ; num = 1)
2) Nivre 86.95 3) O'Neil 86.65
Not significant (p = 0.303569643035696 ; diff = -0.299319632880994 ; num = 3035)
Czech 80.23 77.60 76.99 76.68 75.79 75.12 71.04 69.84 68.41 67.95 64.14 60.81 59.34 57.15 55.56 55.48 49.92 47.72 0.00
1) McDonald 80.23 2) Nivre 77.60
SIGNIFICANT (p = 0.0003999600039996 ; diff = -2.63110029044938 ; num = 3)
2) Nivre 77.60 3) O'Neil 76.99
Not significant (p = 0.200679932006799 ; diff = -0.615270801298522 ; num = 2006)
Danish 84.71 84.59 83.10 81.99 81.63 81.51 81.49 81.07 80.64 78.81 76.06 75.15 74.90 73.22 73.12 69.26 68.69 66.92 60.56
1) Nivre 84.71 2) McDonald 84.59
Not significant (p = 0.411758824117588 ; diff = -0.119760765550254 ; num = 4117)
2) McDonald 84.59 3) Riedel 83.10
SIGNIFICANT (p = 0.0005999400059994 ; diff = -1.48636021872862 ; num = 5)
Dutch 81.15 80.63 80.61 79.70 78.84 77.06 75.52 75.33 74.50 74.09 72.44 71.57 70.33 66.48 65.78 65.19 62.04 60.14 0.00
1) McDonald 81.15 2) Riedel 80.63
Not significant (p = 0.147685231476852 ; diff = -0.519108325872935 ; num = 1476)
2) Riedel 80.63 3) Nivre 80.61
Not significant (p = 0.479052094790521 ; diff = -0.0180931065353747 ; num = 4790)
German 87.30 86.00 85.55 85.21 84.47 84.11 83.83 83.28 81.91 79.84 75.80 74.55 74.08 70.09 67.76 66.40 64.65 61.03 0.00
1) McDonald 87.30 2) Riedel 86.00
SIGNIFICANT (p = 0.0020997900209979 ; diff = -1.29961362838077 ; num = 20)
2) Riedel 86.00 3) Nivre 85.55
Not significant (p = 0.237776222377762 ; diff = -0.456520899192128 ; num = 2377)
Japanese 92.68 91.86 91.74 91.68 91.56 91.33 91.19 91.16 89.62 89.56 88.97 87.20 86.64 86.29 84.99 74.42 73.05 69.64 0.00
1) Nivre 92.68 2) McDonald 91.86
SIGNIFICANT (p = 0.0057994200579942 ; diff = -0.823055506916504 ; num = 57)
2) McDonald 91.86 3) O'Neil 91.74
Not significant (p = 0.302569743025697 ; diff = -0.122771843810114 ; num = 3025)
Portuguese 85.60 84.93 83.98 83.04 82.92 82.32 82.27 82.09 81.83 81.01 78.42 74.06 73.34 68.64 66.75 65.47 65.37 64.84 0.00
1) Nivre 85.60 2) McDonald 84.93
Not significant (p = 0.177482251774823 ; diff = -0.664663371399456 ; num = 1774)
2) McDonald 84.93 3) Sagae 83.98
Not significant (p = 0.083991600839916 ; diff = -0.954567922277093 ; num = 839)
Slovene 72.39 71.86 70.81 70.25 69.95 68.39 68.31 67.76 65.52 65.26 64.59 63.33 61.17 55.54 54.95 54.63 52.94 49.01 47.31
1) McDonald 72.39 2) Corston-Oliver 71.86
Not significant (p = 0.22027797220278 ; diff = -0.531801251956097 ; num = 2202)
2) Corston-Oliver 71.86 3) Cheng 70.81
Not significant (p = 0.076992300769923 ; diff = -1.04879655712051 ; num = 769)
Spanish 80.61 79.47 78.63 78.54 77.99 77.96 76.26 75.78 75.76 74.46 70.86 69.30 68.34 65.19 65.19 64.96 62.33 60.98 42.41
1) McDonald 80.61 2) Nivre 79.47
Not significant (p = 0.0891910808919108 ; diff = -1.14164207938182 ; num = 891)
2) Nivre 79.47 3) Corston-Oliver 78.63
Not significant (p = 0.141985801419858 ; diff = -0.843170003512483 ; num = 1419)
Swedish 83.89 82.32 82.02 81.72 80.99 80.92 80.69 79.76 79.51 78.52 76.98 76.93 72.01 68.90 67.68 64.64 62.94 61.69 60.70
1) Nivre 83.89 2) McDonald 82.32
SIGNIFICANT (p = 0.0018998100189981 ; diff = -1.57379243281468 ; num = 18)
2) McDonald 82.32 3) Do/Chang 82.02
Not significant (p = 0.302569743025697 ; diff = -0.300456152758159 ; num = 3025)
Turkish 73.78 71.96 71.31 70.96 70.93 70.69 70.01 68.82 68.82 68.57 68.16 65.95 65.28 64.00 62.18 53.80 52.97 51.40 51.23
1) Nivre 73.78 2) McDonald 71.96
SIGNIFICANT (p = 0.0002999700029997 ; diff = -1.81538094607124 ; num = 2)
2) McDonald 71.96 3) Sagae 71.31
Not significant (p = 0.107489251074893 ; diff = -0.648934676030152 ; num = 1074)
Download online_results.tar.bz2 (1.3 MB)
These files are tarred and compressed with bzip2, and can be unpacked with either 'tar xjf filename' or 'tar xyf filename' (depending on your version of tar). This creates a directory online_results, which contains files named <participant-name>_<language-name>.txt
Each file contains the HEAD and DEPREL columns as submitted by that shared task participant for the test data of that language. Due to licensing restrictions on most of the data, we cannot include the other columns of the test data, but if you have the test data, you can easily restore the original submission files with a UNIX tool like "paste".
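For those who prefer a script to "paste", the merge can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name restore_conll_lines is made up here, the submission file is assumed to be line-aligned with the test file (including the blank lines between sentences), and its two values are assumed to be separated by whitespace. The column order is the CoNLL-X one (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL).

```python
def restore_conll_lines(test_lines, pred_lines):
    """Merge HEAD/DEPREL predictions back into CoNLL-X test-data lines.

    Assumption: both inputs are line-aligned, with an empty line wherever
    the test data has a sentence boundary.
    """
    if len(test_lines) != len(pred_lines):
        raise ValueError("test data and submission are not line-aligned")
    restored = []
    for test_line, pred_line in zip(test_lines, pred_lines):
        test_line = test_line.rstrip("\n")
        if not test_line.strip():
            restored.append("")  # keep sentence boundaries
            continue
        cols = test_line.split("\t")
        head, deprel = pred_line.split()[:2]
        # Keep ID..FEATS from the test data, add the submitted HEAD and
        # DEPREL, and fill PHEAD/PDEPREL with the "_" placeholder.
        restored.append("\t".join(cols[:6] + [head, deprel, "_", "_"]))
    return restored


if __name__ == "__main__":
    test = ["1\tDas\tdas\tART\tART\t_\t_\t_\t_\t_", ""]
    pred = ["2\tNK", ""]
    print("\n".join(restore_conll_lines(test, pred)))
```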
We hope that these files will be useful for researchers trying to reproduce the shared task experiments for future comparison, and for those working on parser combinations.
Prior to calling eval.pl, gold standard and predicted DEPREL values have been suffixed with a two-letter language code, e.g. the German DEPREL value 'APPR' becomes 'APPR_ge'. Output of eval.pl: everybody_allLangs.eval
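The suffixing step could look roughly like the sketch below. It is not the script that was actually used; the function name suffix_deprel is hypothetical, and the input is assumed to be a tab-separated CoNLL-X line with DEPREL in the eighth column.

```python
def suffix_deprel(line, lang_code):
    """Append '_<lang_code>' to the DEPREL column of one CoNLL-X line."""
    line = line.rstrip("\n")
    cols = line.split("\t")
    if len(cols) < 8:
        return line  # blank line between sentences: leave untouched
    cols[7] = f"{cols[7]}_{lang_code}"  # DEPREL is column 8 (index 7)
    return "\t".join(cols)


if __name__ == "__main__":
    print(suffix_deprel("1\tin\tin\tPREP\tAPPR\t_\t2\tAPPR\t_\t_", "ge"))
```

The same function has to be applied to both the gold standard and the predicted file, so that the labels still match after suffixing.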
AV: average; SD: standard deviation
| Labeled attachment score | Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | Bulgarian |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AV | 59.94 | 78.32 | 67.17 | 78.31 | 70.73 | 78.58 | 85.86 | 80.63 | 65.16 | 73.52 | 76.44 | 55.95 | 79.98 |
| SD | 6.53 | 8.82 | 8.93 | 5.45 | 6.66 | 7.51 | 7.09 | 5.83 | 6.78 | 8.41 | 6.46 | 7.71 | 6.30 |

| Unlabeled attachment score | Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | Bulgarian |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AV | 73.48 | 84.85 | 77.01 | 84.52 | 75.07 | 82.60 | 89.05 | 86.46 | 76.53 | 77.76 | 84.21 | 69.35 | 85.89 |
| SD | 4.94 | 5.99 | 6.70 | 4.29 | 5.78 | 6.73 | 5.20 | 4.17 | 4.67 | 7.81 | 5.45 | 5.51 | 5.60 |

| Label accuracy | Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | Bulgarian |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AV | 75.12 | 81.66 | 76.59 | 84.50 | 77.57 | 86.26 | 89.90 | 85.35 | 76.31 | 85.71 | 80.00 | 69.59 | 84.38 |
| SD | 5.49 | 7.92 | 7.69 | 4.35 | 5.92 | 6.01 | 5.36 | 5.45 | 6.40 | 4.56 | 6.24 | 7.94 | 5.23 |
Significance was computed with version 1.8 of eval.pl and Dan Bikel's Randomized Parsing Evaluation Comparator (Statistical Significance Tester for evalb Output), using its default of 10,000 iterations. Differences are taken to be significant if p < 0.05.
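For readers unfamiliar with randomized significance testing, here is a minimal Python sketch of the general idea: per sentence, the two systems' results are swapped at random, and one counts how often the shuffled score difference is at least as large as the observed one. This is not the comparator's actual code. The function name, the (correct, total) input format and the one-sided counting rule are all assumptions of this sketch. The formula p = (num + 1) / (iterations + 1) is at least consistent with the listing above, where num = 460 goes with p = 461/10001 ≈ 0.0461.

```python
import random


def randomization_test(sys_a, sys_b, iterations=10_000, seed=0):
    """Paired randomization test on per-sentence (correct, total) counts.

    sys_a and sys_b must cover the same sentences in the same order.
    Returns (p, num), where num is the number of shuffles whose score
    difference was at least as extreme as the observed one.
    """
    def score(counts):
        correct = sum(c for c, _ in counts)
        total = sum(t for _, t in counts)
        return 100.0 * correct / total

    rng = random.Random(seed)
    observed = score(sys_a) - score(sys_b)
    sign = 1 if observed >= 0 else -1
    num = 0
    for _ in range(iterations):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(sys_a, sys_b):
            if rng.random() < 0.5:  # swap this sentence's results
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        diff = score(shuffled_a) - score(shuffled_b)
        if sign * diff >= sign * observed:
            num += 1
    return (num + 1) / (iterations + 1), num


if __name__ == "__main__":
    a = [(9, 10), (7, 10), (10, 10), (8, 10)] * 25
    b = [(8, 10), (7, 10), (9, 10), (8, 10)] * 25
    p, num = randomization_test(a, b, iterations=1000)
    print(f"p = {p:.4f} ; num = {num}")
```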
Clarification: "12 l." is a system's score over all 12 required languages. So the top three for "12 l." are the scores of the three systems that performed best overall (the "winners").
Technically, it is not the average of a system's scores on the individual languages but the score on all languages taken together. I computed it by concatenating all 12 gold standard files and, for each system, all 12 submissions (with dummy submissions of zero accuracy for those few files that somebody failed to submit) and then applying eval.pl to these two concatenated files. Because our test sets are all roughly the same size (5000 scoring tokens), the score from this concatenation is at most 0.01% different from the average of the individual language scores. So this difference is irrelevant for the ranking. We can only compute significance of differences in the concatenated results, not of differences in the averages, hence this subtle distinction.
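The distinction can be illustrated with a small Python sketch. The token counts below are invented for illustration and are not taken from the shared task data; the function names are hypothetical too. A missing submission would enter as (0, total), matching the dummy submissions of zero accuracy mentioned above.

```python
def concatenated_score(per_language):
    """Score on the concatenation: all correct tokens over all tokens."""
    correct = sum(c for c, _ in per_language)
    total = sum(t for _, t in per_language)
    return 100.0 * correct / total


def averaged_score(per_language):
    """Plain average of the individual per-language scores."""
    return sum(100.0 * c / t for c, t in per_language) / len(per_language)


if __name__ == "__main__":
    # (correct, total) scoring tokens per language; made-up numbers
    # with test sets of roughly, but not exactly, equal size.
    counts = [(4000, 5000), (3500, 5010), (4500, 4990)]
    print(f"concatenated: {concatenated_score(counts):.2f}")
    print(f"averaged:     {averaged_score(counts):.2f}")
```

With test sets of exactly equal size the two numbers coincide; the more the sizes differ, the further apart they can drift.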
| Top three: labeled attachment | Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | 12 l. | Bulgarian |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1st | 66.91 | 89.96 | 80.18 | 84.79 | 79.19 | 87.34 | 91.65 | 87.60 | 73.44 | 82.25 | 84.58 | 65.68 | 80.27 | 87.57 |
| 2nd | 66.71 | 86.92 | 78.42 | 84.77 | 78.59 | 86.24 | 90.71 | 86.82 | 72.42 | 81.29 | 82.55 | 63.39 | 80.19 | 87.41 |
| 3rd | 66.71 | 86.70 | 76.60 | 83.63 | 78.59 | 85.82 | 90.57 | 86.01 | 71.42 | 80.46 | 82.31 | 63.21 | 78.43 | 86.34 |
| Difference 1st to 2nd significant? | No | Yes | Yes | No | No | Yes | Yes | No | No | No | Yes | Yes | No | No |
| p (1st to 2nd) | 0.382 | 0.000 | 0.011 | 0.500 | 0.196 | 0.010 | 0.004 | 0.090 | 0.060 | 0.112 | 0.000 | 0.000 | 0.348 | 0.401 |
| Difference 2nd to 3rd significant? | No | No | Yes | Yes | No | No | No | No | No | No | No | No | Yes | Yes |
| p (2nd to 3rd) | 0.493 | 0.349 | 0.008 | 0.023 | 0.495 | 0.246 | 0.322 | 0.073 | 0.076 | 0.144 | 0.314 | 0.380 | 0.000 | 0.017 |

| Top three: unlabeled attachment | Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | 12 l. | Bulgarian |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1st | 79.34 | 93.18 | 87.30 | 90.58 | 83.57 | 90.38 | 93.16 | 91.36 | 83.17 | 86.05 | 89.54 | 75.82 | 86.60 | 92.04 |
| 2nd | 78.62 | 91.07 | 85.58 | 89.80 | 82.91 | 89.76 | 93.12 | 91.22 | 83.17 | 85.15 | 89.50 | 74.67 | 85.48 | 91.72 |
| 3rd | 78.54 | 90.64 | 84.80 | 89.66 | 81.73 | 89.16 | 93.10 | 90.30 | 81.77 | 84.87 | 89.05 | 74.49 | 85.30 | 91.30 |

| Top three: label accuracy | Arabic | Chinese | Czech | Danish | Dutch | German | Japanese | Portuguese | Slovene | Spanish | Swedish | Turkish | 12 l. | Bulgarian |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1st | 80.34 | 91.93 | 86.72 | 89.22 | 83.89 | 92.11 | 94.34 | 91.54 | 82.51 | 90.40 | 87.39 | 78.49 | 86.75 | 90.70 |
| 2nd | 80.18 | 89.01 | 85.40 | 89.16 | 83.69 | 91.15 | 93.74 | 90.46 | 81.91 | 90.06 | 85.58 | 77.71 | 86.65 | 90.44 |
| 3rd | 80.00 | 88.93 | 83.80 | 88.22 | 83.51 | 91.03 | 93.58 | 90.22 | 81.10 | 89.46 | 85.14 | 77.63 | 85.62 | 89.27 |