Concordance

A concordance (Hindi: अनुक्रमणिका) is a list built by arranging the principal words that occur in a book or other work in alphabetical or some other order. Computer programs have made this work comparatively easy. Before them, concordances could be compiled only for very important works (such as the Vedas, the Bible, the Quran, and the works of Shakespeare), because the task was difficult and demanded a great deal of time and money.
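
As a rough illustration of the definition above, the following Python sketch builds a small concordance: it maps each word to the positions at which it occurs and lists the words in alphabetical order. The function names `tokenize` and `build_concordance` and the sample sentence are assumptions made for this example and do not come from any particular tool. Combining marks are treated as parts of words so that scripts such as Devanagari are not split at vowel signs.

```python
import unicodedata
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words.

    Letters (L*), combining marks (M*) and numbers (N*) count as word
    characters, so Devanagari vowel signs stay attached to their words.
    """
    words, current = [], []
    for ch in text:
        if unicodedata.category(ch)[0] in "LMN":
            current.append(ch)
        elif current:
            words.append("".join(current).lower())
            current = []
    if current:
        words.append("".join(current).lower())
    return words


def build_concordance(text):
    """Map each word to its 0-based word positions, with words in sorted order."""
    positions = defaultdict(list)
    for index, word in enumerate(tokenize(text)):
        positions[word].append(index)
    return dict(sorted(positions.items()))


# Hypothetical sample text, used only for illustration.
sample = "the cat saw the dog and the dog saw the cat"
for word, where in build_concordance(sample).items():
    print(word, where)
```

A real concordance would usually record page, chapter, or line references rather than bare word positions; the position list here stands in for those references.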

The world's first concordance was made for the Vulgate Bible. It was compiled by Hugh of St Cher (died 1262) with the help of 500 monks. In 1448 Rabbi Mordecai Nathan made a concordance of the Hebrew Bible, which took ten years to prepare.

Use of concordances in linguistics

In linguistics, concordances are often used when a body of text needs to be studied or analysed. Some examples are given below:

Comparing different usages of the same word
Analysing keywords
Studying word frequencies
Finding and analysing idioms and proverbs
Finding translations of terminology and similar items in bitexts and translation memories
Creating alphabetical indexes and word lists (useful for publishing)
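
Two of the uses listed above, frequency study and comparison of a word's different usages, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `normalize`, `word_frequencies`, and `kwic`, the `width` parameter, and the sample sentences are all hypothetical, and words are split on whitespace only, which is far cruder than what real concordancers do.

```python
import string
from collections import Counter

# Punctuation stripped from word edges; the danda is included for Hindi text.
PUNCTUATION = string.punctuation + "।"


def normalize(word):
    """Lowercase a word and strip punctuation from its edges."""
    return word.strip(PUNCTUATION).lower()


def word_frequencies(text):
    """Count how often each normalized word occurs in the text."""
    return Counter(w for w in map(normalize, text.split()) if w)


def kwic(text, keyword, width=2):
    """Return keyword-in-context lines with `width` words on each side of a hit."""
    words = text.split()
    target = normalize(keyword)
    lines = []
    for i, word in enumerate(words):
        if normalize(word) == target:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


# Hypothetical sample text, used only for illustration.
sample = "Time flies like an arrow. Fruit flies like a banana."
print(word_frequencies(sample).most_common(2))
for line in kwic(sample, "flies"):
    print(line)
```

Placing each occurrence of the keyword next to its neighbouring words is what lets a reader compare usages at a glance, here "flies" as a verb in one line and as a noun in the other.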

Concordancing techniques are used extensively in corpora such as the American National Corpus and the British National Corpus.

Computer programs that provide facilities for applying concordancing techniques are called concordancers (Hindi: संवादित्र).

Concordancers used in corpus linguistics

AntConc (freeware developed by Laurence Anthony at Waseda University, Japan)
ApSIC Xbench
MonoConc (commercial software developed by Michael Barlow)
PowerConc (freeware, developed by researchers at the National Research Centre for Foreign Language Education, Beijing Foreign Studies University, China)
WordSmith (commercial software developed by Mike Scott)
Sketch Engine (commercial software developed by Lexical Computing Ltd.)
NoSketch Engine (open source)
GlossaNet/Unitex (open-source free software)
AdTAT (free software developed by The University of Adelaide)
CorpusEye (corpus search interface)
KH Coder (open-source free software)
myCAT from Olanto (open-source)
Linguistic Toolbox (freeware) [1][2]

References

[1] "Linguistic Toolbox". Archived from the original on 2 April 2016. Retrieved 24 March 2016.
[2] It has an integrated part-of-speech tagger that allows users to create their own PoS-annotated corpora and to conduct the various types of searches used in corpus linguistics.

See also

Text corpus (पाठसंग्रह / Corpus)
multiple tool (archived 2023-08-29 at the Wayback Machine)

External links

AntConc - A freeware concordance program for Windows, Macintosh OS X, and Linux. It also works with Devanagari.
Simple Concordance Program (SCP) (free; very useful for non-Devanagari scripts, but it does not recognize matras (vowel signs) in Devanagari, and it is unclear how Devanagari characters should be entered in the data file)
TextSTAT - Simple Text Analysis Tool (TextSTAT 2.9c for Windows does not work properly with Devanagari; it does not recognize matras.)
Unitex - Unitex is a corpus processing system, based on automata-oriented technology.
Hindi Text Analysis, Text Processing and Concordance
Alex Catalogue of Electronic Texts - The Alex Catalogue is a collection of public domain electronic texts from American and English literature as well as Western philosophy. Each of the 14,000 items in the Catalogue is available as full text and also comes with a concordance. Consequently, you can count the number of times a particular word is used in a text or list the most common (10, 25, 50, etc.) words.
Concord - Page includes link to Concord, an on-the-fly KWIC concordance generator. Works with at least some non-Latin scripts (modern Greek, for instance). Multiple choices for sorting results; multi-platform; open source.
ConcorDance - A concordance interface to the WorldWideWeb, it uses Google's or Yahoo's search engine to find concordances and can be used directly from the browser.
KH Coder - Free software for generating KWIC concordances and collocation statistics. Various statistical analysis functions are also available, such as co-occurrence networks, multidimensional scaling, hierarchical cluster analysis, and correspondence analysis of words.

