Here is a detailed statistical analysis of Indian language Wikipedias for the month of 2010 January.
The PDF of this analysis is available at http://shijualexonline.googlepages.com/2010_01_january_en.pdf. Please refer to the PDF for the tables, as the tables in this blog post are quite long.
The data for this report is taken from the statistical analysis of all the Wikimedia wikis prepared and maintained by Erik Zachte (Website: http://infodisiac.com/). Special thanks to Erik for all the support he extended while I was compiling this report. The data is collected on the last day of every month. That is, the data for the month of 2010 January was collected at 23:59 GMT on 2010 January 31.
The statistical analysis of the following Indian language Wikipedias is included in this blog post.
Assamese (http://as.wikipedia.org)
Bengali (http://bn.wikipedia.org)
Bhojpuri (http://bh.wikipedia.org)
Bishnupriya Manipuri (http://bpy.wikipedia.org)
Gujarathi (http://gu.wikipedia.org)
Hindi (http://hi.wikipedia.org)
Kannada (http://kn.wikipedia.org)
Kashmiri (http://ks.wikipedia.org)
Malayalam (http://ml.wikipedia.org)
Marathi (http://mr.wikipedia.org)
Odia (Oriya) (http://or.wikipedia.org)
Pali (http://pi.wikipedia.org)
Punjabi (http://pa.wikipedia.org)
Sanskrit (http://sa.wikipedia.org)
Sindhi (http://sd.wikipedia.org)
Tamil (http://ta.wikipedia.org)
Telugu (http://te.wikipedia.org)
Urdu (http://ur.wikipedia.org)
I have also included the statistical analysis of a few other language Wikipedias from the Indian subcontinent, even though these languages are not spoken in India, because I am very much interested in the wiki activity of these languages.
Burmese (http://my.wikipedia.org)
Nepal Bhasha/Newari (http://new.wikipedia.org)
Nepali (http://ne.wikipedia.org)
Sinhala (http://si.wikipedia.org)
I know that there is little meaning in comparing an inactive wiki with few articles, like Assamese or Oriya, to highly active Wikipedias like Hindi or Telugu. But for this month, let me treat all of them together. From next month onwards I would like to treat them as two separate groups.
I hope this initiative will improve the interaction between the different Indian language Wikipedias and Wikipedians. We (Malayalam Wikipedians - http://ml.wikipedia.org) have been maintaining a similar comparison study of the major Indian language Wikipedias for the past two years. This analysis has helped us understand the status of Malayalam Wikipedia compared to other Indian language Wikipedias. I hope this analysis will also help other Indian language Wikipedias.
Please feel free to add your suggestions/analysis as comments on this post. I have divided this report into two sections.
Statistical analysis of Wikipedias
Localization status of MediaWiki software
Following are the different topics covered under each section.
Wikipedia Statistics
Article statistics
Number of Articles
Number of Edits
Break up of edits (2009 February – 2010 January)
Edits per article
Number of new articles/day
Average size of an article (bytes)
Database size (in megabytes)
Percentage of articles with size greater than 500 bytes
Percentage of articles with size greater than 2000 bytes (2 kilobytes)
User Statistics
MediaWiki Statistics
Localization statistics
Number of Articles
Wikipedia Language | 2009 November | 2009 December | 2010 January |
Assamese | 261 | 261 | 263 |
Bengali | 20,754 | 20,918 | 21,016 |
Bhojpuri | 2,480 | 2,481 | 2,481 |
Bishnupriya Manipuri | 23,424 | 24,733 | 24,738 |
Gujarathi | 11,255 | 11,904 | 12,579 |
Hindi | 52,144 | 52,645 | 53,216 |
Kannada | 7,596 | 7,741 | 7,846 |
Kashmiri | | | 375 |
Malayalam | 11,459 | 11,635 | 11,871 |
Marathi | 25,737 | 26,034 | 26,544 |
Odia (Oriya) | 553 | 553 | 553 |
Pali | | | 2,316 |
Punjabi | 1,490 | 1,492 | 1,505 |
Sanskrit | 3,883 | 3,887 | 3,914 |
Sindhi | | | 349 |
Tamil | 20,095 | 20,472 | 20,959 |
Telugu | 44,098 | 44,238 | 44,333 |
Urdu | | | 12,547 |
| | | |
Burmese | | | 2,938 |
Nepal Bhasha/Newari | | | 61,487 |
Nepali | | | 3,079 |
Sinhala | | | 2,153 |
Let us hope the stub articles will get more content as more active contributors arrive.
Number of Edits
Wikipedia Language | 2009 November | 2009 December | 2010 January |
Assamese | 8,926 | 9,134 | 9,290 |
Bengali | 5,51,486 | | 5,86,472 |
Bhojpuri | 52,553 | 53,203 | 54,099 |
Bishnupriya Manipuri | 4,18,566 | 4,29,198 | 4,36,153 |
Gujarathi | 63,578 | 67,769 | 72,492 |
Hindi | 5,51,162 | 5,67,029 | 5,81,447 |
Kannada | 1,22,964 | 1,26,504 | 1,29,848 |
Kashmiri | | | 13,075 |
Malayalam | 5,33,391 | 5,51,307 | 5,69,056 |
Marathi | 4,45,205 | 4,58,769 | 4,74,113 |
Odia (Oriya) | 19,805 | 20,052 | 20,321 |
Pali | | | 48,865 |
Punjabi | 16,980 | 17,426 | 18,176 |
Sanskrit | 67,151 | 68,557 | 70,132 |
Sindhi | | | |
Tamil | 4,59,441 | 4,71,678 | 4,83,481 |
Telugu | 4,69,481 | 4,76,825 | 4,83,390 |
Urdu | | | 2,70,868 |
Burmese | | | 30,503 |
Nepal Bhasha/Newari | | | 4,58,066 |
Nepali | | | 40,363 |
Sinhala | | | 74,493 |
A larger number of edits from multiple contributors will enhance the quality of articles in a Wikipedia.
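The edit counts in the table above use the Indian digit grouping (5,51,486 means 551,486). The following is a minimal sketch of how such figures could be read and compared month to month. The function names `parse_indian_number` and `monthly_growth` are illustrative assumptions and are not part of any tool used to prepare this report.

```python
def parse_indian_number(text):
    """Convert a grouped figure such as '5,51,486' to an int.

    Returns None for an empty cell (data not available).
    """
    cleaned = text.replace(",", "").strip()
    return int(cleaned) if cleaned else None


def monthly_growth(previous, current):
    """Return the number of edits added between two monthly snapshots."""
    if previous is None or current is None:
        return None  # cannot compare when a snapshot is missing
    return current - previous


# Figures copied by hand from the table above (Hindi).
hindi_dec = parse_indian_number("5,67,029")
hindi_jan = parse_indian_number("5,81,447")
hindi_growth = monthly_growth(hindi_dec, hindi_jan)
print(hindi_growth)
```

Empty cells, such as the 2009 December figure for Bengali, are kept as missing values rather than treated as zero.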
Break up of edits (2009 February – 2010 January)
Wikipedia Language | Bot edits (%) | User edits by registered and anonymous users (%) |
Assamese | 55 | 45 |
Bengali | 68 | 32 |
Bhojpuri | 92 | 8 |
Bishnupriya Manipuri | 96 | 4 |
Gujarathi | 37 | 63 |
Hindi | 49 | 51 |
Kannada | 53 | 47 |
Kashmiri | 83 | 17 |
Malayalam | 37 | 63 |
Marathi | 59 | 41 |
Odia (Oriya) | 92 | 8 |
Pali | 94 | 6 |
Punjabi | 55 | 45 |
Sanskrit | 82 | 18 |
Sindhi | 29 | 71 |
Tamil | 51 | 49 |
Telugu | 51 | 49 |
Urdu | 53 | 47 |
| | |
Burmese | 41 | 59 |
Nepal Bhasha/ Newari | 91 | 9 |
Nepali | 52 | 48 |
Sinhala | 12 | 88 |
Edits per article
Wikipedia language | 2009 November | 2009 December | 2010 January |
Assamese | 19.3 | 20.0 | 20.2 |
Bengali | 16.7 | 17.0 | 17.2 |
Bhojpuri | 5.5 | 11.7 | 18.7 |
Bishnupriya Manipuri | 9.3 | 9.1 | 9.3 |
Gujarathi | 4.3 | 4.4 | 4.5 |
Hindi | 7.0 | 7.2 | 7.3 |
Kannada | 12.8 | 12.9 | 13.1 |
Kashmiri | 28.3 | 28.8 | 29.3 |
Malayalam | 26.6 | 27.1 | 27.4 |
Marathi | 13.1 | 13.3 | 13.5 |
Odia (Oriya) | 21.8 | 22.1 | 22.5 |
Pali | 17.7 | 18.0 | 18.3 |
Punjabi | 8.0 | 8.3 | 8.6 |
Sanskrit | 14.7 | 15.0 | 15.2 |
Sindhi | 23.5 | 23.8 | 24.1 |
Tamil | 16.3 | 16.5 | 16.5 |
Telugu | 8.0 | 8.0 | 8.1 |
Urdu | 15.9 | 16.0 | 16.2 |
| | | |
Burmese | 6.3 | 6.4 | 6.6 |
Nepal Bhasha/Newari | 3.0 | 2.9 | 2.9 |
Nepali | 9.7 | 10.0 | 9.9 |
Sinhala | 23.2 | 21.4 | 21.8 |
Number of new articles/day
Wikipedia Language | 2010 January |
Assamese | Not Available |
Bengali | 5 |
Bhojpuri | Not Available |
Bishnupriya Manipuri | Not Available |
Gujarathi | Not Available |
Hindi | 18 |
Kannada | Not Available |
Kashmiri | Not Available |
Malayalam | 8 |
Marathi | 17 |
Odia (Oriya) | Not Available |
Pali | Not Available |
Punjabi | Not Available |
Sanskrit | Not Available |
Sindhi | Not Available |
Tamil | 16 |
Telugu | 3 |
Urdu | 5 |
| |
Burmese | 3 |
Nepal Bhasha/ Newari | 45 |
Nepali | Not Available |
Sinhala | 3 |
The above table shows the average number of articles created in a wiki daily.
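As a rough cross-check, a daily average can be estimated from the article counts of two consecutive months. This is only a sketch under the assumption that the average is the month's net increase divided by the number of days. The published figures come from Erik Zachte's statistics and may be calculated differently, so the results need not match the table for every wiki. The function name `new_articles_per_day` is illustrative.

```python
def new_articles_per_day(previous_count, current_count, days_in_month):
    """Estimate the average number of articles added per day in a month."""
    return (current_count - previous_count) / days_in_month


# Article counts copied from the "Number of Articles" table
# (2009 December and 2010 January); January has 31 days.
hindi_rate = new_articles_per_day(52645, 53216, 31)
tamil_rate = new_articles_per_day(20472, 20959, 31)

print(round(hindi_rate), round(tamil_rate))
```

For Hindi and Tamil this estimate agrees with the table after rounding. For some other wikis, such as Bengali, it does not, which suggests the source uses a somewhat different definition.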
Average size of an article (bytes)
Wikipedia language | 2009 November | 2009 December | 2010 January |
Assamese | 2506 | 2506 | 1492 |
Bengali | 1342 | 1383 | 1407 |
Bhojpuri | 118 | 119 | 119 |
Bishnupriya Manipuri | 1084 | 1086 | 1090 |
Gujarathi | 1056 | 1098 | 1099 |
Hindi | 1182 | 1235 | 1275 |
Kannada | 1923 | 2526 | 2806 |
Kashmiri | 424 | 422 | 420 |
Malayalam | 2690 | 2725 | 2740 |
Marathi | 768 | 777 | 800 |
Odia (Oriya) | 236 | 236 | 236 |
Pali | 141 | 141 | 141 |
Punjabi | 741 | 740 | 759 |
Sanskrit | 184 | 187 | 197 |
Sindhi | 4092 | 4080 | 4070 |
Tamil | 2118 | 2441 | 2574 |
Telugu | 832 | 883 | 915 |
Urdu | 1535 | 1554 | 1550 |
| | | |
Burmese | 2986 | 3037 | 3033 |
Nepal Bhasha/ Newari | 707 | 805 | 882 |
Nepali | 1256 | 1282 | 1259 |
Sinhala | 5892 | 5430 | 5452 |
Database size (in megabytes)
Wikipedia language | 2009 November | 2009 December | 2010 January |
Assamese | 1.5 | 1.5 | 1.5 |
Bengali | 81 | 84 | 86 |
Bhojpuri | 4.8 | 4.8 | 4.8 |
Bishnupriya Manipuri | 65 | 65 | 65 |
Gujarathi | 32 | 35 | 37 |
Hindi | 165 | 174 | 181 |
Kannada | 42 | 53 | 59 |
Kashmiri | 0.77 | 0.77 | 0.78 |
Malayalam | 88 | 90 | 93 |
Marathi | 63 | 64 | 67 |
Odia (Oriya) | 1.2 | 1.2 | 1.2 |
Pali | 4.7 | 4.7 | 4.7 |
Punjabi | 4 | 4 | 4.1 |
Sanskrit | 6.6 | 6.6 | 6.8 |
Sindhi | 2.8 | 2.8 | 2.9 |
Tamil | 119 | 138 | 148 |
Telugu | 97 | 103 | 107 |
Urdu | 40 | 42 | 42 |
| | | |
Burmese | 24 | 26 | 26 |
Nepal Bhasha/Newari | 107 | 128 | 144 |
Nepali | 9.2 | 9.5 | 9.8 |
Sinhala | 34 | 39 | 40 |
Percentage of articles with size greater than 500 bytes
Wikipedia language | 2009 November | 2009 December | 2010 January |
Assamese | 41 | 41 | 41 |
Bengali | 56 | 57 | 57 |
Bhojpuri | 2 | 2 | 2 |
Bishnupriya Manipuri | 85 | 86 | 86 |
Gujarathi | 19 | 19 | 20 |
Hindi | 42 | 43 | 43 |
Kannada | 54 | 55 | 55 |
Kashmiri | 12 | 12 | 12 |
Malayalam | 84 | 84 | 84 |
Marathi | 26 | 26 | 27 |
Odia (Oriya) | 2 | 2 | 2 |
Pali | 1 | 1 | 1 |
Punjabi | 16 | 16 | 16 |
Sanskrit | 4 | 5 | 5 |
Sindhi | 61 | 60 | 60 |
Tamil | 81 | 82 | 82 |
Telugu | 22 | 22 | 22 |
Urdu | 55 | 55 | 55 |
| | | |
Burmese | 67 | 67 | 67 |
Nepal Bhasha/Newari | 60 | 62 | 63 |
Nepali | 55 | 56 | 54 |
Sinhala | 78 | 81 | 81 |
The above table shows the percentage of articles with size more than 500 bytes.
Percentage of articles with size greater than 2000 bytes (2 kilobytes)
Wikipedia language | 2009 November | 2009 December | 2010 January |
Assamese | 22 | 22 | 22 |
Bengali | 14 | 14 | 15 |
Bhojpuri | 1 | 1 | 1 |
Bishnupriya Manipuri | 1 | 1 | 1 |
Gujarathi | 5 | 5 | 5 |
Hindi | 9 | 10 | 10 |
Kannada | 15 | 16 | 17 |
Kashmiri | 5 | 5 | 5 |
Malayalam | 34 | 35 | 35 |
Marathi | 6 | 7 | 7 |
Odia (Oriya) | 1 | 1 | 1 |
Pali | 0 | 0 | 0 |
Punjabi | 8 | 8 | 8 |
Sanskrit | 1 | 1 | 1 |
Sindhi | 33 | 33 | 33 |
Tamil | 24 | 25 | 25 |
Telugu | 8 | 8 | 8 |
Urdu | 17 | 17 | 17 |
| | | |
Burmese | 38 | 39 | 39 |
Nepal Bhasha/Newari | 2 | 7 | 11 |
Nepali | 10 | 10 | 10 |
Sinhala | 53 | 53 | 53 |
Number of active wikipedians
Wikipedia language | 2009 November | 2009 December | 2010 January |
Assamese | 1 | 1 | 1 |
Bengali | 25 | 35 | 32 |
Bhojpuri | 1 | 1 | 1 |
Bishnupriya Manipuri | 6 | 6 | 4 |
Gujarathi | 7 | 9 | 8 |
Hindi | 50 | 62 | 51 |
Kannada | 24 | 22 | 22 |
Kashmiri | 0 | 0 | 1 |
Malayalam | 56 | 50 | 65 |
Marathi | 22 | 25 | 36 |
Odia (Oriya) | 0 | 0 | 0 |
Pali | 0 | 1 | 0 |
Punjabi | 1 | 1 | 4 |
Sanskrit | 4 | 5 | 6 |
Sindhi | 1 | 0 | 2 |
Tamil | 45 | 55 | 53 |
Telugu | 38 | 34 | 26 |
Urdu | 24 | 20 | 20 |
| | | |
Burmese | 1 | 1 | 4 |
Nepal Bhasha/Newari | 3 | 3 | 3 |
Nepali | 8 | 5 | 5 |
Sinhala | 45 | 37 | 7 |
Page views per month (all figures in lakh page views/month; 1 lakh = 100,000)
Wikipedia language | 2009 November | 2009 December | 2010 January |
Assamese | 0.87 | 0.93 | 0.86 |
Bengali | 22 | 28 | 24 |
Bhojpuri | 0.09 | 0.09 | 0.11 |
Bishnupriya Manipuri | 13 | 15 | 14 |
Gujarathi | 4.6 | 5.4 | 4.9 |
Hindi | 41 | 49 | 41 |
Kannada | 8.08 | 9.16 | 7.72 |
Kashmiri | 0.52 | 0.57 | 0.51 |
Malayalam | 27 | 28 | 26 |
Marathi | 23 | 28 | 24 |
Odia (Oriya) | 0.41 | 0.42 | 0.40 |
Pali | 0.85 | 0.83 | 0.82 |
Punjabi | 1.24 | 1.28 | 1.33 |
Sanskrit | 2.00 | 2.11 | 2.17 |
Sindhi | 0.57 | 0.61 | 0.54 |
Tamil | 24 | 26 | 24 |
Telugu | 37 | 41 | 33 |
Urdu | 11 | 10 | 10 |
| | | |
Burmese | 1.58 | 1.60 | 1.77 |
Nepal Bhasha/Newari | 13 | 14 | 15 |
Nepali | 1.44 | 1.47 | 1.55 |
Sinhala | 2.82 | 3.02 | 3.50 |
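For readers not used to the lakh unit, the figures above can be converted to plain page view counts by multiplying by 100,000. A minimal sketch follows. The names `LAKH` and `lakh_to_views` are illustrative and not taken from the source data.

```python
LAKH = 100_000  # 1 lakh = 100,000


def lakh_to_views(value_in_lakh):
    """Convert a figure given in lakh to a whole number of page views."""
    return round(value_in_lakh * LAKH)


# 2010 January figures copied from the table above.
hindi_views = lakh_to_views(41)
kannada_views = lakh_to_views(7.72)

print(hindi_views, kannada_views)
```

So 41 lakh page views for Hindi corresponds to 4,100,000 page views, and 7.72 lakh for Kannada corresponds to 772,000.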
MediaWiki localization status (percentage)
Language | Most often used messages | MediaWiki messages | Extensions used by Wikimedia | All extensions |
Assamese | 98.08 | 43.83 | 1.86 | 1.61 |
Bengali | 100.00 | 82.36 | 46.09 | 22.25 |
Bhojpuri | 0.21 | 0.08 | 0.00 | 0.00 |
Bishnupriya Manipuri | 100.00 | 52.51 | 0.11 | 0.30 |
Gujarathi | 100.00 | 40.79 | 5.91 | 6.59 |
Hindi | 99.36 | 97.22 | 29.43 | 26.44 |
Kannada | 100.00 | 59.63 | 3.55 | 3.21 |
Kashmiri | | | | |
Malayalam | 100.00 | 97.90 | 98.00 | 51.77 |
Marathi | 98.72 | 75.88 | 26.19 | 37.13 |
Odia (Oriya) | 4.48 | 1.39 | 0.25 | 0.30 |
Pali | 0.21 | 0.08 | 0.00 | 0.00 |
Punjabi | 56.08 | 30.26 | 0.42 | 0.42 |
Sanskrit | 97.65 | 27.22 | 0.00 | 0.34 |
Sindhi | 73.13 | 24.91 | 0.11 | 0.07 |
Tamil | 92.32 | 74.71 | 1.02 | 1.77 |
Telugu | 100.00 | 100.00 | 65.41 | 52.57 |
Urdu | 71.64 | 38.77 | 1.75 | 1.12 |
| | | | |
Burmese | 29.00 | 10.45 | 0.07 | 0.02 |
Nepal Bhasha/Newari | 32.84 | 12.35 | 0.04 | 0.01 |
Nepali | 96.59 | 68.77 | 0.98 | 0.90 |
Sinhala | 100.00 | 100.00 | 28.59 | 20.06 |
GerardM, a translatewiki.net administrator, has been passing the message below to most of the Indian language Wikipedias. He has also sent mails about this to the wikimediaindia mailing list a couple of times.
We expect that with the implementation of Localisation update the usability of MediaWiki for your language will improve. We are now ready to look at other aspects of usability for your language as well. There are two questions we would like you to answer: Are there issues with the new functionality of the Usability Initiative? Does MediaWiki support your language properly? The best way to answer the first question is to visit the translatewiki.net...
Localization of the wiki software is very important when we are trying to reach prospective Wikipedians in any language. Within a language community, a website whose interface and system messages are all in the local language has an edge over a website that is available in English only. This is where localization of the MediaWiki software comes in. We use translatewiki.net to coordinate the localization efforts for all the languages. Two administrators of translatewiki.net, Siebrand and GerardM, are also available on this list.
Some of the Malayalam Wikipedians (including me) understood the importance of localizing the MediaWiki messages into Malayalam long ago (more than 2 years ago). From the above table you can see that Malayalam is at the forefront in localizing MediaWiki and other related system messages. Nowadays the Malayalam Wikipedian Praveen Prakash is coordinating the localization of MediaWiki messages for Malayalam.
I request each community to give top priority to localizing the MediaWiki messages into its language. When you do so, you are localizing the interface and system messages into your language. Apart from helping the wiki projects of your language, you are also helping native speakers use the MediaWiki software in their own language.
Nepali is spoken in India, see the Wikipedia article of Nepali.