Card name normalization on punctuation needs improved consistency

spacemonaut
Bauble reclaimer
Posts: 1378
Joined: 4 years ago
Answers: 10
Pronoun: she / her
Location: Scotland

Post by spacemonaut » 3 years ago

Card name recognition is a bit quirky and inconsistent right now in how it handles punctuation.

When a name has a comma in it, card completion will helpfully recognise the card name even if you type the name wrong:

Image

However, this is not consistent for all punctuation in the name: I can't skip the apostrophe.

Image

(This list appears to tell us what card names will generate a preview. Atraxa, Praetors' Voice and Atraxa Praetors' Voice both display the card name, but the version without an apostrophe, Atraxa Praetors Voice, which is missing from autocomplete, does not generate a preview image.)

This treatment is also not consistent for all cards with commas in their names. Niv-Mizzet's name is helpfully autocompleted even if I skip the dash:

Image

However, leaving out the comma is not fine in this case:

Image

This again extends to image hovers: Niv-Mizzet, Dracogenius and Niv Mizzet, Dracogenius both work, but neither Niv Mizzet Dracogenius nor Niv-Mizzet Dracogenius does.

For some cards, no punctuation is allowed to be left out:

Image

Image

Hovers again: Atris, Oracle of Half-Truths vs Atris Oracle of Half-Truths (no comma), Atris, Oracle of Half Truths (no dash), Atris Oracle of Half Truths (both missing)

Image

Image

Hovers again: Acornelia, Fashionable Filcher vs Acornelia Fashionable Filcher

For some cards, autocomplete suggests a version wrapped in quotation marks, which I think is a leftover from preview season:

Image

So far I've been able to identify "Acolyte of Affliction", "Arasta the Endless Nest", "Hero of the Games", "Mischievous Chimera", "Many-Faced Thaumaturgus", "Nightmare Shepherd", "One with the Stars", "Slaughter-Priest of Mogis", and "Sweet Oblivion" in the list of cards with this property.

Going by "Arasta the Endless Nest", which completes to Arasta of the Endless Web, these all seem to be a leftover from foreign cards in THB's season.




I have the following suggestions:
  • The system should normalize consistently over punctuation marks. If I can skip a comma or a dash in one card name, I should be able to do it for other card names consistently.
  • While the card name parser should recognise what I mean when I type out incorrect names like "Atraxa Praetors' Voice", the autocomplete should not suggest them. If I've typed in "Atraxa Prae", recognising I mean "Atraxa, Praetors' Voice" is great, but it should suggest that correct name and only the correct name.
  • If autocomplete recognises temporary preview season names even after preview season ends and the card name has been updated, it should also still suggest only the actual name. (Personally I'm not sure the autocomplete needs to continue recognising those names.)
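
To make the three suggestions concrete, here is a rough Python sketch. It is not how the site works; the `normalize`, `build_index` and `suggest` names and the `ALIASES` table are all made up for illustration. The idea is that every name (and every old preview name) is indexed under a punctuation-free key, but only the real oracle name is ever returned as a suggestion.

```python
import re
import unicodedata

ORACLE_NAMES = [
    "Atraxa, Praetors' Voice",
    "Niv-Mizzet, Dracogenius",
    "Atris, Oracle of Half-Truths",
    "Arasta of the Endless Web",
]

# Hypothetical alias table: temporary preview name -> current oracle name.
ALIASES = {"Arasta the Endless Nest": "Arasta of the Endless Web"}


def normalize(name):
    """Lowercase, turn hyphens and slashes into spaces, drop other punctuation."""
    text = unicodedata.normalize("NFKD", name).casefold()
    text = re.sub(r"[-/]+", " ", text)
    text = re.sub(r"[^a-z0-9 ]+", "", text)
    return " ".join(text.split())


def build_index(names, aliases):
    """Pair every searchable key with the oracle name it should resolve to."""
    index = [(normalize(name), name) for name in names]
    index += [(normalize(alias), target) for alias, target in aliases.items()]
    return index


INDEX = build_index(ORACLE_NAMES, ALIASES)


def suggest(typed, index=INDEX, limit=5):
    """Return oracle names whose key (or alias key) starts with the typed text."""
    key = normalize(typed)
    if not key:
        return []
    found = []
    for candidate_key, oracle_name in index:
        if candidate_key.startswith(key) and oracle_name not in found:
            found.append(oracle_name)
    return found[:limit]
```

With this, `suggest("Atraxa Prae")` and `suggest("arasta the endless n")` would each offer only the proper oracle name, whichever punctuation was typed or skipped.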

Feyd_Ruin
Elder Vampire
Posts: 5410
Joined: 5 years ago
Answers: 3
Pronoun: he / him

Post by Feyd_Ruin » 3 years ago

So right now these are just "nicknames" for cards we've put in. Of course, this was before the autocomplete, which I agree now makes it confusing.

I'll dig into it over the next few days and see if I can make it more streamlined and forgiving.
To the beaten, the broken, or the damned; the lost, and the wayward: wherever I may be, you will have a home.

Post by Feyd_Ruin » 3 years ago

This will still need a larger overhaul soon, but I've improved it decently and added as much of the above as possible for now.

I've automated a nickname process so that all cards have a searchable nickname that excludes all apostrophes, commas, etc.
Limit of current implementation: we only have a singular nickname, so if you use a comma but not a hyphen it'll suddenly not match since it's either all or none. I'll have to fork our implementation into using an entire Nickname database, as well as some fuzzy matching, instead of just a nickname field.

The autocomplete now won't show both the oracle name and the nickname.
Current limitation: It will show the oracle name as long as you match the proper name, but once you match only the nickname it will show the nickname. This is a limitation of at.js, which we are currently using (the list has to match what's typed, even though I've done the work beforehand). I'll have to fork phpBB Mention anyway, since that's what we're piggybacking on, and I'll either find an at.js replacement that works as needed, or perhaps fork it for our use.

ISBPathfinder
Bebopin
Posts: 2161
Joined: 4 years ago
Pronoun: he / him
Location: SD, USA

Post by ISBPathfinder » 3 years ago

sad robot this little guy didn't find him. (I did find gary though so that was cool)
[EDH] Vadrok List (Suicide Chads) | Evelyn List (Vamp Mill) | Sanwell List | Danitha List | Indominus List | Ratadrabik List

Post by Feyd_Ruin » 3 years ago

ISBPathfinder wrote:
3 years ago
sad robot (I did find gary though so that was cool)
I could have sworn I added sad robot, but I've done a few iterations of automated nicknames.
I'll definitely look at creating a new database of nicknames, as well as fuzzy, etc.
Sad robot will come back. ^_^

Post by ISBPathfinder » 3 years ago

@Feyd_Ruin Hummm a bit off topic but it looks like I can't get Legion's Landing to show up without referring to it as Legion's Landing // Adanto, the First Fort.

That might be..... slightly inconvenient lol. Using the card name search function I could find it but when using copy / paste I thought maybe I had the wrong ' in there or something.

EDIT: Are you ninja fixing things? I swear it was like 30 seconds after I posted this that it worked.

Post by Feyd_Ruin » 3 years ago

ISBPathfinder wrote:
3 years ago
EDIT: Are you ninja fixing things?
Always.
I also adjusted the automated nicknames for the other split cards.
I swear it's whack-a-mole and it really needs a full table.

Post by Feyd_Ruin » 3 years ago

So I took another stab and figured out a couple of the issues that were holding us back.
The list will now show you the proper oracle names at all times, even when you're off (no more nicknames showing).

I've added Soundex fuzzy to the search:
image.png

I will admit, though, that it can show some oddballs mixed in with what you're looking for and legitimate close matches:
image.png
Still needs a full nickname database workup to take it to the level we really want.
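
For anyone wondering why Soundex lets oddballs through, here is a small Python sketch of classic American Soundex. How the site actually applies it (per word, per whole name, or through a database function) isn't stated here, so treat the per-word `soundex` helper below as an assumption for illustration only.

```python
# Classic American Soundex consonant groups; vowels and y get no code.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code of a single word ('' if no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = CODES.get(first, "")
    for letter in letters[1:]:
        if letter in "hw":
            continue  # h and w do not separate letters that share a code
        code = CODES.get(letter, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this, so a repeated code counts again
    return (first.upper() + "".join(digits) + "000")[:4]
```

The code only keeps the first letter plus three consonant classes, so a typo like "Mizet" lands on the same code as "Mizzet", which is the helpful part. The catch is that "Atraxa" and "Atris" also share a code (A362), which is the kind of collision that would produce unrelated suggestions.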

@spacemonaut

Post by spacemonaut » 3 years ago

Looking great. :) It does give odd results but it gives me correct results too, which is super.
Feyd_Ruin wrote:
3 years ago
This will still need a larger overhaul soon, but I've improved it decently and added as much of the above as possible for now.

I've automated a nickname process so that all cards have a searchable nickname that excludes all apostrophes, commas, etc.
Limit of current implementation: we only have a singular nickname, so if you use a comma but not a hyphen it'll suddenly not match since it's either all or none. I'll have to fork our implementation into using an entire Nickname database, as well as some fuzzy matching, instead of just a nickname field.

The autocomplete now won't show both the oracle name and the nickname.
Current limitation: It will show the oracle name as long as you match the proper name, but once you match only the nickname it will show the nickname. This is a limitation of at.js, which we are currently using (the list has to match what's typed, even though I've done the work beforehand). I'll have to fork phpBB Mention anyway, since that's what we're piggybacking on, and I'll either find an at.js replacement that works as needed, or perhaps fork it for our use.
I don't know if this is still relevant to the current implementation given "niv-mizzet draco" definitely is still autocompleting, but it sounds like you may want to normalize the user input as well as the database record.

Store a normalized, punctuation-free version of the name, then take what the user's typing, create a normalized punctuation-free version of that too, and compare the two. If the database stores "akiri line slinger" then any variation on "Akiri, Line-Slinger" would be normalized according to the same rules too before being compared. You can do that normalization client-side or server-side in the autocomplete lookup.
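
Roughly what I mean, as a Python sketch. The `normalize` rules, the `CARDS` table and `lookup` are my own stand-ins and not the site's code. The point is that the same function runs over the stored name and the typed text, so a missing comma, a missing hyphen, or both all land on the same key.

```python
import re


def normalize(name):
    """Apply one set of rules to both stored names and user input."""
    # Hyphens and slashes become spaces so "Line-Slinger" equals "Line Slinger".
    spaced = re.sub(r"[-/]+", " ", name.casefold())
    # Every other punctuation mark (commas, apostrophes, quotes) is dropped.
    kept = re.sub(r"[^\w\s]", "", spaced)
    # Collapse any leftover runs of whitespace.
    return " ".join(kept.split())


# Stand-in for the database: normalized key -> oracle name.
CARDS = {normalize(name): name for name in ["Akiri, Line-Slinger"]}


def lookup(typed):
    """Return the oracle name for whatever punctuation variant was typed."""
    return CARDS.get(normalize(typed))
```

Here `lookup("Akiri Line-Slinger")`, `lookup("Akiri, Line Slinger")` and `lookup("akiri line slinger")` would all resolve to the same record, because each input normalizes to the stored key "akiri line slinger".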
Feyd_Ruin wrote:
3 years ago
So I took another stab and figured out a couple of the issues that were holding us back.
The list will now show you the proper oracle names at all times, even when you're off (no more nicknames showing).

I've added Soundex fuzzy to the search:
image.png


I will admit, though, that it can show some oddballs mixed in with what you're looking for and legitimate close matches:
image.png

Still needs a full nickname database workup to take it to the level we really want.

@spacemonaut
You could run a Levenshtein comparison with a max distance of 3-4?
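
A minimal Python sketch of that idea, with made-up helper names (`levenshtein`, `close_matches`) and a cutoff of 3. It compares the whole typed string against each whole card name, so it assumes the user has typed roughly the full name; partial input would need a different comparison.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def close_matches(typed, names, max_distance=3):
    """Return names within max_distance edits of the typed text, closest first."""
    typed = typed.casefold()
    scored = [(levenshtein(typed, name.casefold()), name) for name in names]
    return [name for distance, name in sorted(scored) if distance <= max_distance]


NAMES = [
    "Niv-Mizzet, Dracogenius",
    "Atris, Oracle of Half-Truths",
    "Acornelia, Fashionable Filcher",
]
```

For example, "niv mizet dracogenius" is three edits away from "Niv-Mizzet, Dracogenius" (one swapped space, one missing z, one missing comma), so it would still match at a cutoff of 3 while the other two names would not.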
