The scripts here are collected for use by people studying language, not for entertainment. The scripts have scenes out of order and don't tag the names associated with dialogue lines. These scripts cannot be read as a substitute for the games they're from.
These scripts are stripped down so that they cannot be used as a substitute for the original visual novels. They only contain information that's useful to corpus linguistics, statistics, and language learners.
New: The stats project is moving to a wiki.
Stats page (may go out of date)
How the stats are generated
If you have a script not listed here, please post it on /jp/'s Daily Japanese Thread. Include the text "vn script" in your post so I can find it. Dies irae has the ideal script format. Newlines are line breaks, not line wraps. Pagebreaks are double newlines. Dialogue is marked with 「」. Furigana is marked with 《》, with the associated span marked with 〈〉. If 《》〈〉 show up in the original script, they should be replaced with «»‹›.
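The format rules above can be sketched as a small reader. This is only a rough sketch, not the tooling used for this collection: the function names are made up for illustration, and it assumes the 〈〉 span comes directly before its 《》 reading, which the description above does not spell out.

```python
import re

# Assumption: the base span 〈...〉 is immediately followed by its reading 《...》.
RUBY = re.compile(r"〈([^〈〉]*)〉《([^《》]*)》")


def split_pages(script: str) -> list:
    """Split a script into pages (double newline), each a list of lines."""
    return [page.split("\n") for page in script.split("\n\n") if page]


def ruby_pairs(line: str) -> list:
    """Return (base text, furigana) pairs found in one line."""
    return RUBY.findall(line)


def strip_ruby(line: str) -> str:
    """Drop the furigana and the span markers, keeping only the base text."""
    return RUBY.sub(r"\1", line)


def is_dialogue(line: str) -> bool:
    """Dialogue lines open with the 「 bracket."""
    return line.startswith("「")


if __name__ == "__main__":
    sample = "〈漢字〉《かんじ》を読む\n「はい」\n\n次のページ"
    for page in split_pages(sample):
        for line in page:
            print(is_dialogue(line), strip_ruby(line), ruby_pairs(line))
```

Splitting on pagebreaks first and on newlines second keeps line breaks and page breaks separate, which matters for scripts where the two are not the same thing.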
If you're donating, I prioritize untouched data files that are already easy for me to dump, like the following: BGI (either version), RScript, PJAdv, AdvTry, ExHIBIT, FVP, krkr .scn, Majiro, Musica, TamoGameSystem, Willadv (some versions), igscript. Some textual script krkr games are easy to dump, but they can have pretty much any format.
All scripts are encoded as UTF-8 because Shift-JIS does not contain every character used in every VN. Fate/Stay Night and Dies irae are obvious examples that can't be encoded with Shift-JIS. For reference, the original scripts come in three encodings: Shift-JIS, UTF-8, and UTF-16. Shift-JIS is the most common and UTF-16 the least. Every script has to use the same encoding, and UTF-8 avoids altering any weird text: it covers every character used in every script, and characters that don't exist in any standard encoding can be put in Unicode's private use area.
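As a rough illustration of the re-encoding step, here is a minimal sketch. It is not the actual conversion code: the detection order, the use of Python's cp932 codec (the Windows superset of Shift-JIS), the assumption that UTF-16 files carry a byte order mark, and the helper names are all assumptions made for the example.

```python
import codecs

# BMP private use area, where custom-encoded glyphs can be parked.
PUA_START, PUA_END = 0xE000, 0xF8FF


def decode_script(raw: bytes) -> str:
    """Guess among the three source encodings and return the decoded text."""
    # Assumption: UTF-16 scripts start with a byte order mark.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    try:
        # Strict UTF-8 (with or without BOM) fails on typical Shift-JIS bytes.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("cp932")


def to_utf8(raw: bytes) -> bytes:
    """Re-encode a script as UTF-8 regardless of its source encoding."""
    return decode_script(raw).encode("utf-8")


def custom_glyph(index: int) -> str:
    """Map the index of a game-specific glyph to a private use code point."""
    code_point = PUA_START + index
    if not PUA_START <= code_point <= PUA_END:
        raise ValueError("glyph index does not fit in the BMP private use area")
    return chr(code_point)


if __name__ == "__main__":
    print(to_utf8("日本語".encode("cp932")).decode("utf-8"))
    print(hex(ord(custom_glyph(3))))
```

A file that is pure ASCII decodes identically under UTF-8 and Shift-JIS, so the guess order only matters once Japanese text appears.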
Right now this collection has about 25% of the amount of text it needs to be a reasonable corpus. Please help me build it up. Even doujin writing is fine.
2017-10-01 Added Oretsuba, its prelude, and its afterstory. These were very hard to rip. If there's a problem please report it somewhere.
2017-09-19 Added several VNs donated by an anon: Iinchou wa Shounin Sezu! ~It Is a Next Choice~, Hatsukoi Yohou, Kimi no Koe ga Kikoeru, Love of Ren'ai Koutei of LOVE!, Nora to Oujo to Noraneko Heart, Tsujidou-san no Virgin Road, Reincarnation ☆ Shinsengumi!, Prawf Clwyd, and Hen Koi ≒ Kuro Rekishi. Vnscripts is now an archive download. Legacy links will remain but won't be updated even if their scripts are redumped.
2017-09-09 Added Sen no Hatou, Tsukisome no Kouki, Futsuu no Fantasy, and Yoake Mae yori Ruriiro na, donated by an anon.
2017-09-08 Added White Album and Parfait, donated by an anon.
2017-08-09 Added Leyline 2 and 3, donated by an anon.
2017-07-30 Added Shugaten.
2017-07-30 Added Nanairo Reincarnation.
2017-07-30 Fixed stray formatting in Clover Point.
2017-07-29 Added Nanatsuiro Drops, Tarareba, Twinkle Crusaders, Clover Point, Fortune Arterial, and Dracu-Riot!.
2017-07-28 Added Trinoline, donated by anon.
2017-07-28 Added Daitoshokan no Hitsujikai.
2017-07-28 Added Daitoshokan no Hitsujikai -Dreaming Sheep-.
2017-07-26 Added Watashi ga Suki nara Suki tte Itte, donated by anon.
2017-07-11 Added Jingai Makyou, Tsukikage no Simulacre, Sharnoth Fvr, and the Chronobox Trials, donated by anon.
2017-06-26 Added Chronobox and Bishoujo Mangekyou -Norowareshi Densetsu no Shoujo-, donated by anon.
2017-06-26 Added Itsuka, Todoku, Ano Sora ni. Renamed hoshimemo's text file to hoshimemo.
2017-06-26 Added Senren Banka for an anon, and Cafe Sourire.
2017-06-26 Added Astelight Shuushuubako.
2017-06-26 Added Kagerou Touryuuki. Fixed Inganock linewraps using heuristics and a redump.
2017-06-26 Added Tsuki ni Yorisou Otome no Sahou.
2017-06-24 Added Tsuushinbo, donated by anon.
2017-06-11 Fixed FSN common route duplication and ruby text problems, and pagebreaks. Fixed kamimaho pagebreaks. Added Imakoi ~Succubus to Soitogeru Ore!?~, Tsujidou-san no Junai Road, Princess Frontier, and Narisokonai Snow White, donated by an anon.
2017-06-08 Added the first three entries in the Flowers series, donated by an anon.
2017-06-02 Fixed Kajiri, and two versions of it are given now. Only the longer version is used on the stats page.
2017-05-31 Fixed the linebreaks for the five VNs from light recently added. Kajiri is still truncated.
2017-05-31 Fixed some duplicate text, some missing text, and some possible formatting issues in Muramasa.
2017-05-31 Fixed Axanael linebreaks/pagewraps
2017-05-31 Fixed Axanael parsing problems
2017-05-31 Fixed missing text in Axanael. Still has parsing problems.
2017-05-31 Added Subarashiki Hibi and Axanael donated by anon. These were hard to parse (though not as hard as the other five) and might contain errors.
2017-05-31 Added five scripts donated by an anon: Silverio Vendetta, Sensinkan Bansenzin, Sensinkan Hatimyouzin, Kaziklu, and Kajiri. These were hard to parse and might contain errors.
2017-05-27 Fixed Katahane newlines and formatting (speaker names were marked; all line wraps were treated as line breaks instead of just ones that cause pauses)
2017-05-25 Fixed Muramasa formatting (some text was missing because my formatting removal regexes got confused)
2017-05-25 Added Majokoi
2017-05-25 Added Hanachirasu, Kamimaho, Soramitsu, Eustia
2017-05-23 Normalized all line endings to LF (Unix style), this reduces some script sizes by a few KB
2017-05-23 Added Leyline 1 and Flyable Heart
2017-05-23 updated most scripts to remove/fix leftover formatting. added Hanahira and Fureraba
2017-05-22 (12 hours later) Fixed shirokuma bell stars ruby text
2017-05-22 added shirokuma bell stars, ruby text currently in wrong format
2017-05-20 initial page: Dies irae, FSN, Aokana, Katahane, Hoshimemo, Muramasa, Satsukoi, Idea, Magicha, Inganock, Aete Mushisuru
*1 The format of this script can handle custom encoded text. This text has been mapped to the BMP's private use area.
** Several scripts don't record pagebreaks. For some of them, linebreaks and pagebreaks are the same thing. For others they're not. I probably won't ever fix this.
** A couple of scripts have small amounts of duplicated scenes because of how the original scripts handle choices. Not every choice branch in the affected VNs does this, just a couple. This isn't enough of a problem to worry about, and I can't fix it automatically: the content of the scenes can be slightly different (this is the case in FSN), and fixing it by removing duplicate lines would also remove "normal" duplication.
I have a few VNs that I haven't figured out how to rip yet. If you have tools, docs, or scripts for these games, let me know. Same /jp/ dealio.
Haruka ni Aogi, Uruwashi no
Progress on donated scripts: https://pastebin.com/raw/TKnszHq9 -- If you want to see one of these dumped and it's not here yet, find someone else who can do it. I don't have the motivation left to do dumping anymore, just processing.
For modern krkr games with working encryption, use https://tlwiki.org/index.php?title=File:Kikiriki.rar. The files you get are going to be garbled and have no real filenames, but they usually look like they just have a single-value xor over them. You can write something to detect binary script files based on every byte after the first byte by xoring by it. However, this only works for some modern encrypted krkr games.
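One possible reading of the xor trick, sketched below: if the script files are known to start with some fixed header, the first byte gives away the key, and the bytes after it confirm or reject the guess. This is a sketch under that assumption, not a tested ripping tool. The function names and the b"SCR0" header in the demo are hypothetical placeholders, and the real header bytes depend on the game.

```python
from typing import Optional


def find_xor_key(data: bytes, magic: bytes) -> Optional[int]:
    """Return the single-byte xor key if data decodes to something starting with magic."""
    if not magic or len(data) < len(magic):
        return None
    # The first byte fixes the candidate key...
    key = data[0] ^ magic[0]
    # ...and every following header byte must agree with it.
    if all(byte ^ key == expected for byte, expected in zip(data, magic)):
        return key
    return None


def xor_decode(data: bytes, key: int) -> bytes:
    """Undo a single-value xor over the whole file."""
    return bytes(byte ^ key for byte in data)


if __name__ == "__main__":
    magic = b"SCR0"  # hypothetical header, not a real krkr signature
    garbled = bytes(b ^ 0x5A for b in magic + b"payload")
    key = find_xor_key(garbled, magic)
    if key is not None:
        print(hex(key), xor_decode(garbled, key))
```

Files for which find_xor_key returns None are either not scripts or not protected by a plain single-value xor, which matches the caveat that this only works for some games.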