
ChatGPT and other large language models (LLMs) are a major technological breakthrough, although some of the claims made about their capabilities are overstated. Rather than being genuinely creative or intelligent, what such models do is continually repackage, rephrase and recombine things that human beings have already created. This ability can be very powerful for specific use-cases, for example, where you want to pull together ‘content’ on a particular topic without needing to know the provenance of that content.
With this in mind, I was curious to find out what level of Sanskrit knowledge and ability ChatGPT could display. So I tried out various prompts, both on the OpenAI website, which uses GPT-3.5, and in Bing Chat, which uses the more up-to-date and more powerful GPT-4.
I started by asking in Sanskrit about the date of the Sanskrit poet Kālidāsa, with the prompt, ‘कालिदासस्य समयः क:?’ The less advanced GPT-3.5 responded in Hindi, throwing out two possible time periods as follows –
“कालिदास का जन्म साढ़े चारवीं सदी (4th century) में हुआ था। इसका अधिक निश्चित तिथि नहीं है, लेकिन विशेषज्ञों के अनुसार उनका जन्म साढ़े चारवीं सदी के आसपास, संभावत: 4th या 5th शताब्दी में हुआ था।”
Adding the instruction ‘Reply in Sanskrit’ to the above prompt produced a strange mix of Hindi and Sanskrit. By contrast, Bing (GPT-4) gave a much more impressive answer to the original prompt, not only replying in Sanskrit but giving a more thoughtful and detailed response, which carefully explained that scholarly opinion on his dating is divided –
“कालिदासस्य समयः क: इति प्रश्नस्य निश्चितं उत्तरं नास्ति । विद्वांसः विभिन्नानि मतानि प्रतिपादयन्ति । केचन तं प्रथमशतकीयं मन्यन्ते, केचन द्वितीयशतकीयं, केचन पञ्चमं षष्ठं वा । केचन तं विक्रमादित्यस्य नवरत्नेषु अन्यतमं मन्यन्ते । केचन तं मालवसंवत् प्रवर्तकस्य विक्रमादित्यस्य समकालिकं मन्यन्ते ।
“कालिदासस्य समये सम्बन्धीयानि केचन जालपृष्ठाणि कालिदासः — विकिपीडिया, महाकवि कालिदास का सम्पूर्ण परिचय, कालिदास — विकिपीडिया इति सन्ति । एतेषु पृष्ठेषु कालिदासस्य समये सम्बन्धीया मता: प्रस्तुता: सन्ति ।”
This references both the Hindi and the Sanskrit Wikipedia pages on Kālidāsa. As the latter is indeed a major online source of information on this question, I checked the Sanskrit Wikipedia page on Kālidāsa and found an extensive discussion of Kālidāsa’s dating, which GPT-4 appears to have skilfully condensed, not simply by extracting phrases but by rephrasing and summarising the content with a good command of vocabulary and grammar.
I then asked for information about the Māṇḍūkya Upaniṣad, using the prompt, ‘माण्डूक्योपनिषत् किम् अस्ति ?’ I got similar results, although here even GPT-4 started throwing in a few Hindi particles, पर and के, right towards the end.
“माण्डूक्योपनिषत् एका उपनिषत् अस्ति, या अथर्ववेदे अन्तर्भवति । इयम् उपनिषत् ओङ्कारस्य व्याख्यानं कुरुते, यः सर्वं ब्रह्म निर्दिशति । इयम् उपनिषत् आत्मनः चतुर्विधं स्वरूपं प्रकाशयति, यत् जागृत्-स्वप्न-सुषुप्ति-तुरीय-अवस्थाभेदेन प्रकल्पितं । माण्डूक्योपनिषति केवलं १२ मन्त्राः सन्ति, किन्तु अस्या महती महिमा समाचारिता ।
“अधिकं जानितुम्, माण्डूक्योपनिषत् — विकिपीडिया परिशीलयतु । Mandukya Upanishad — Sanskrit Documents पर मूलम् संस्कृतम् पाठं पठितुम् शक्नोति । Mandukyopanishat with Karika by Gaudapada — Sanskrit Documents पर माण्डूक्योपनिषति के सहितं सगौडपादीयकारिका प्राप्ता ।”
The reason for adding पर twice seems to be that the model included two links with English titles and realised the need to decline them into the Sanskrit locative case. Quite understandably, it struggled to know how this could be done, so it reached out to Hindi for the separable particle that that language provides. This shows how such models simply maximise the statistical probability of each output, without treating consistency within one language as a distinct constraint. It may also explain why GPT-3.5 preferred to respond in Hindi rather than in Sanskrit. One possible improvement would be a re-trained version of the model specifically tuned to constrain its outputs to Sanskrit.
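Since Sanskrit and Hindi share the Devanagari script, this kind of language mixing cannot be caught by script detection alone, but the Latin-script insertions seen above are easy to flag automatically. A small illustrative sketch (the function name and regex are my own, not from any particular library):

```python
import re

def foreign_script_spans(text: str) -> list[str]:
    """Return any runs of Latin letters embedded in Devanagari output.

    A run is one or more Latin letters, allowing internal spaces,
    periods, apostrophes and hyphens (e.g. an English page title).
    """
    return re.findall(r"[A-Za-z][A-Za-z .'-]*[A-Za-z]|[A-Za-z]", text)

# The English link title is flagged; pure Devanagari passes clean.
print(foreign_script_spans("पर मूलम् Mandukya Upanishad इति"))
print(foreign_script_spans("माण्डूक्योपनिषत् एका उपनिषत् अस्ति"))
```

Distinguishing Hindi particles such as पर from legitimate Sanskrit words (पर is also a common Sanskrit word) would require morphological analysis rather than a simple character check.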
I next asked, ‘What does Ramayana say about cArvAka philosophy?’ Here there was a vast difference in output. GPT-3.5 started with the disappointingly inaccurate claim that “The Ramayana, one of the two major Sanskrit epics of ancient India, does not specifically address the Charvaka philosophy directly.” GPT-4, on the other hand, gave a thorough and insightful answer to this question.
LLMs are prone to so-called ‘hallucinations’: that is, they sometimes make up results which are not true. This is especially problematic because LLMs are optimised for plausibility rather than for truth, so they make even false claims appear entirely believable. This drawback is illustrated by both answers to my next prompt, which was ‘What line of meghaduta has the word शुक्लापाङ्गैः?’ (I used the declined form to try to make the task easier.) Every bit of both of the following answers, including the Sanskrit verses, is entirely made up!
GPT-3.5 output –
The word “शुक्लापाङ्गैः” appears in the first verse of Meghaduta, which is as follows:
संदेशोऽसि प्रजानां नगरीरधिपतेर्द्युतिमतां।
तव शुक्लापाङ्गैः प्रणिहितः स्वमेघः परिक्षिपति॥
This verse translates to:
“You are the message for the prosperous citizens of the king of cities,
Sent by your gleaming white clouds, your cloud shall encircle it.”
In this verse, “शुक्लापाङ्गैः” (shuklāpāṅgaiḥ) means “white-winged,” referring to the gleaming white clouds.
GPT-4 output –
The word shuklApAGgaiH (शुक्लापाङ्गैः) appears in the 76th stanza of Meghaduta, in the second part (Uttara-megha). It means “with white streaks” and refers to the cloud that the yaksha is addressing. The full stanza is:
शुक्लापाङ्गैर्विनिर्मुक्तैर्विद्युत्कोटिसमैरपि । अविरामं विलोकेऽस्मिन् विद्युत्कोटिरिवाम्बरे ॥
shuklApAGgairvinirmuktairvidyutkotisamairapi | avirAmaM viloke’smin vidyutkotirivAmbare ||
It can be translated as:
With white streaks freed from even flashes of lightning, You look incessantly in this sky like a flash of lightning
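One simple guard against fabricated quotations like these is to check any verse a model cites against a trusted digital corpus, such as a locally saved e-text of the Meghadūta. A minimal sketch, which normalises whitespace, daṇḍas and similar marks before comparing (the `corpus` string here is just a stand-in for a full e-text; it contains the genuine opening words of the Meghadūta):

```python
import re

def verse_in_corpus(verse: str, corpus: str) -> bool:
    """Check whether a quoted verse occurs in a reference e-text.

    Strips spaces, daṇḍas, avagrahas and hyphens so that differences
    in word-splitting or punctuation don't cause false negatives.
    """
    def normalise(s: str) -> str:
        return re.sub(r"[\s।॥|ऽ'\-]+", "", s)
    return normalise(verse) in normalise(corpus)

# Stand-in corpus: the real opening of the Meghaduta's first verse.
corpus = "कश्चित्कान्ताविरहगुरुणा स्वाधिकारात्प्रमत्तः"

print(verse_in_corpus("कश्चित् कान्ता-विरह-गुरुणा", corpus))  # found despite spacing
print(verse_in_corpus("संदेशोऽसि प्रजानां", corpus))          # fabricated line is absent
```

A check like this would have immediately exposed both of the invented verses above, since neither occurs anywhere in the actual poem.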
To see how we could benefit from the vivid imagination of LLMs, I then wrote the following prompt in English: ‘Write a subhasita about knowledge’. Even without using the word Sanskrit, both GPT-3.5 and GPT-4 composed ‘original’ śloka couplets, with accompanying transliteration and translation, the latter also including some extra information about what subhasitas are. The verses and stated translations were as follows –
GPT-3.5 output –
ज्ञानं परमं धनं। विद्या ददाति विनयं।
अज्ञानं मतिर्भद्रं यत्र नान्यत्र पश्यति।
Knowledge is the ultimate wealth. It bestows humility.
Ignorance is the root of all evils; in knowledge, one sees solutions everywhere.
This attempt by GPT-3.5 appears to miss a syllable in the first and third pādas. The whole second line makes no sense and does not correspond to the stated translation. This output does show, however, that GPT-3.5 is capable of interacting in Sanskrit, although, unlike GPT-4, it cannot reliably understand when a Sanskrit response is wanted.
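The syllable shortfall is easy to check mechanically: in Devanagari, each akṣara carries exactly one vowel, so counting consonants not silenced by a virāma, plus independent vowel letters, gives the syllable count of a pāda (an anuṣṭubh pāda should have eight). A rough sketch of my own, which ignores rarer signs such as candrabindu and Vedic accents:

```python
VIRAMA = "\u094D"  # ् silences the inherent vowel of the preceding consonant

def count_syllables(pada: str) -> int:
    """Count akṣaras (syllables) in a Devanagari pāda.

    A consonant not followed by a virāma contributes one syllable
    (whether it takes the inherent 'a' or an explicit vowel sign);
    an independent vowel letter also contributes one.
    """
    chars = [c for c in pada if not c.isspace()]
    count = 0
    for i, c in enumerate(chars):
        cp = ord(c)
        if 0x0904 <= cp <= 0x0914:        # independent vowels अ..औ
            count += 1
        elif 0x0915 <= cp <= 0x0939:      # consonants क..ह
            if i + 1 >= len(chars) or chars[i + 1] != VIRAMA:
                count += 1
    return count

for pada in ["ज्ञानं परमं धनं", "विद्या ददाति विनयं",
             "अज्ञानं मतिर्भद्रं", "यत्र नान्यत्र पश्यति"]:
    print(count_syllables(pada))
```

For the GPT-3.5 verse this yields 7, 8, 7, 8, confirming the missing syllable in the first and third pādas.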
GPT-4 output –
विद्या विनयं ददाति विनयाद् याति पात्रता ।
पात्रत्वात् धनमाप्नोति धनाद् धर्मं ततः सुखम् ॥
Knowledge gives humility, from humility comes worthiness. From worthiness one obtains wealth, from wealth one performs righteousness, from that comes happiness.
The attempt by GPT-4 is better both metrically and semantically, although the chain of reasoning is perhaps a little tenuous and doubtless comes from recombining ideas and structures without a sufficient grasp of meaning, probably compounded by the limited amount of relevant online content. The lack of external sandhi on the first word of the second line probably reflects the fact that much Sanskrit content online is written without sandhi applied. On the whole, though, it is an impressive output. Again, an adapted LLM trained on a larger and more specific data set could improve the output still further.
I also tried various other prompts, such as ‘Write a short kavya on a topic of your choice’ and ‘नासदीयसूक्ते भाष्यं लिखतु’, with varying results. In a follow-up article, I will explore whether there are any compelling Sanskrit-related use-cases for ChatGPT and how outputs might be improved through further tuning and training of such models.