A couple of months ago, a friend invited me to join him in an online Dungeons and Dragons campaign. Despite my respectable nerd cred, I’d never actually played DnD. Not that I was opposed to it; in fact, it sounded fun, and I liked that it encouraged collaboration and quick-witted creativity. Perhaps it could stave off the monotony that my life had become from thesis-writing in solitary COVID-19 confinement.

I bungled together a character I thought would be interesting to play: a variant aasimar hexblade warlock. I wasn’t too keen to read the many sacred tomes that extensively document the game elements in order to optimize my warlock, but my new DnD compatriots gently nudged me in the right direction: make the largest stat allocation to Charisma (alas, with the actual charisma of a sweaty sock, I can hardly do the role-play justice).

In doing this, I began to ponder: how well-balanced is the actual gameplay? Do all optimizations simply lead to the same fundamental character build? Or can I deduce a character’s class based on a limited subset of character information?

To build a classifier and answer this question, I needed a dataset of DnD characters. The one I used contains over 7,900 character entries submitted by users via a web application form built by its creator. He lists a few potential caveats with the dataset, none of which seem to be showstoppers for the simple study I am trying to do.

I’ll be using Python with the pandas and sklearn libraries as the standard toolset. The (messy) Jupyter notebook with this analysis can be found in this git repository if you’re interested in following along.

The data contains many different features, including user metadata for each submitted entry as well as the actual character data of interest. Most of this information will not be necessary, as we are primarily interested in the character stats and vitals, namely: