What factors go into rating your deck?

cryogen
GΘΔ†
Posts: 1056
Joined: 4 years ago
Pronoun: he / him
Location: Westminster, MD

Post by cryogen » 4 years ago

"Casual"
"Competitive"
"A 7"

There is a lot of discussion surrounding how we determine the power level of our decks in order to better facilitate Rule 0. But what actually goes into that decision? What metrics do you personally use to decide on the power level of your deck?

Some considerations you might have:
How fast does your deck intend to win?
How consistent is your deck?
How important is winning to you when you play?
How well can your deck interact with your opponents, and is it capable of disrupting their potential win?


As a follow up question, what sort of things would be included in your ideal scale in order to find your ideal category and best matchup?
Sheldon wrote:You're the reason we can't have nice things.

pokken
Posts: 6354
Joined: 4 years ago
Answers: 2
Pronoun: he / him

Post by pokken » 4 years ago

I'm actually thinking about trying to write some code to analyze decklists, and this thread could be a cool resource particularly if people can make quantitative statements.

My initial thought is to run any number of algorithms against a decklist and assign a number of points for each metric, then total those points up and use them as a comparative tool. Grooming the dataset that assigns points would be the obviously most difficult part.

Some metrics I've been thinking of:

1) A points list of cards that are just high raw power. See the Canadian Highlander points list for example. Some cards are just very powerful no matter what your deck is. This list would need to be quite a bit larger than the Canadian Highlander list since we would want to include points for things not on their list, like Chord of Calling.

Basically any card that increases your consistency or is ahead of curve could have some number of points, even cards like fact or fiction or llanowar elves could be worth some number of points just for the added consistency.

My thinking is you start with the most innocuous thing that might increase consistency on its own and allocate that as 1 point, then you find the most egregiously busted card in the format (mana crypt for example) and that's the most points (say, 50 or 100). Then try to put everything on that spectrum.

Say as a strawman --

Cultivate - 1 point
arcane signet - 10 points
windswept heath - 15 points
llanowar elves - 20 points
deathrite shaman - 25 points
birds of paradise - 30 points
chrome mox - 40 points
mox diamond - 50 points
mystical tutor - 60 points
vampiric tutor - 70 points
sol ring - 80 points
mana crypt - 100 points
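
A minimal sketch of how a points list like the strawman above could be totalled for a decklist. The point values are the strawman numbers from this post; the function name `score_cards` and the sample decklist are assumptions made up for illustration, not part of any existing tool.

```python
# Strawman point values copied from the list above (lower-cased for lookup).
CARD_POINTS = {
    "cultivate": 1,
    "arcane signet": 10,
    "windswept heath": 15,
    "llanowar elves": 20,
    "deathrite shaman": 25,
    "birds of paradise": 30,
    "chrome mox": 40,
    "mox diamond": 50,
    "mystical tutor": 60,
    "vampiric tutor": 70,
    "sol ring": 80,
    "mana crypt": 100,
}


def score_cards(decklist, points=CARD_POINTS):
    """Sum the points of every listed card; cards not on the list score 0."""
    return sum(points.get(name.strip().lower(), 0) for name in decklist)


# Hypothetical decklist fragment, just to show the call.
sample = ["Sol Ring", "Cultivate", "Llanowar Elves", "Forest"]
print(score_cards(sample))  # 80 + 1 + 20 + 0 = 101
```

Lower-casing the names on lookup is a design guess so that "Sol Ring" and "sol ring" match; a real tool would probably need proper card-name normalization.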

2) Curve analysis - lower curve is likely to be higher power although that's not always the case (see that Karador 1-drops deck). Low curve will usually have something to say for consistency if nothing else.

3) An additional points list of combos (e.g. Kiki-Jiki, Mirror Breaker is not any points by himself but if your decklist includes any of the combo outlets it becomes worth points).

These could even be strong synergies, e.g. azusa, lost but seeking + crucible of worlds = 10 points.

Sample:
green sun's zenith + dryad arbor - 10 points
kiki-jiki, mirror breaker + any 5 CMC combo piece → 20 points
kiki-jiki, mirror breaker + any 4 CMC combo piece → 25 points
kiki-jiki, mirror breaker + any 3 CMC combo piece → 30 points
isochron scepter + dramatic reversal - 60 points
flash + protean hulk → 100 points

The idea is that you could crowd source these points numbers and then massage them until it starts to fit.
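
A rough sketch of the combo points list, assuming each combo is stored as an unordered set of card names that only scores when every piece is in the deck. The point values are the sample numbers above; `COMBO_POINTS` and `score_combos` are hypothetical names. The Kiki-Jiki rows are left out because "any N CMC combo piece" would need mana value data that this sketch does not have.

```python
# Each combo is a frozenset of lower-cased card names mapped to its points.
COMBO_POINTS = {
    frozenset({"green sun's zenith", "dryad arbor"}): 10,
    frozenset({"azusa, lost but seeking", "crucible of worlds"}): 10,
    frozenset({"isochron scepter", "dramatic reversal"}): 60,
    frozenset({"flash", "protean hulk"}): 100,
}


def score_combos(decklist, combos=COMBO_POINTS):
    """Add up points for every combo whose pieces are all in the deck."""
    deck = {name.strip().lower() for name in decklist}
    return sum(pts for pieces, pts in combos.items() if pieces <= deck)


print(score_combos(["Flash", "Protean Hulk", "Isochron Scepter"]))  # 100
print(score_combos(["Flash"]))  # 0, a lone combo piece is worth nothing here
```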

I suspect that you would get extremely close (within +/- 1 point on a 1-10 scale) very quickly from just doing a points list, combo and curve analysis.




----------------------------------------------------------------------------------------------------

As far as what I personally do right now, I try to:

1) Clearly identify if I am playing a CEDH deck.
2) Provide a strong estimate on a 1 to 10 scale based on a variety of factors, most of which you listed. Usually I'm thinking how fast is my deck and how strong is its gameplan and how good is it at disrupting other people's gameplans. Most of my decks go from 5-8.

One of the challenges I've had in the past is with decks that are very, very strong if people don't have interaction. Many of my weaker decks will go absolutely bananas if people can't answer the board.

The interactivity axis is probably the hardest thing to assess. I have run into this with others as well. A good buddy of mine has a very strong Mairsil deck that will win by turn 6 or so pretty consistently but is absolutely dependent on Mairsil to win. If someone shuts Mairsil down he can't really do anything.

toctheyounger
Posts: 3991
Joined: 4 years ago
Pronoun: he / him
Location: Auckland, New Zealand

Post by toctheyounger » 4 years ago

For me there's a couple of factors that increase the perceived power level of my decks.

1) Speed - how quickly does the deck establish a viable threat?
2) Resilience - how robust is the strategy? This counts in terms of card and hand variation as well as resistance to removal.
3) Presence of ubiquitous power pieces
4) Depth of synergy - obviously the deep sea depths of synergy is infinite combo; how far does the deck go down this route and how easily are the combos enabled? 2-card combos are 20,000 leagues under the sea, 4-card combos are scuba diving and so on.
5) Level of oppression - how much does the deck shut down the rest of the table and how easily does it do this?

That's more or less it for me. edit: I think this gets a lot harder to portray to other people if they either ostensibly don't care or don't share a similar set of parameters, so ultimately there's a fair percentage of games where even having this discussion is redundant.
Malazan Decks of the Fallen
| Shadowthrone/Lazav | Raest/Yidris | T'iam / The Ur-Dragon |

DirkGently
My wins are unconditional
Posts: 4587
Joined: 4 years ago
Pronoun: he / him

Post by DirkGently » 4 years ago

If it's one of my decks, it's a 7.

Unless people don't like it. Then it's an 8.

Jeez, you guys have some complicated-ass rating systems going on over there. lol.

For other players, I want to know if someone's playing a cEDH deck, or if they're playing a deck with 2-card combos, MLD, etc. Tapping out turn 5 and getting blindsided by armageddon is not my fave. Beyond that I'm happy to figure it out on my own.
Perm Decks
Phelddagrif - Kaervek - Golos - Zirilan

Flux Decks
Gollum - Lobelia - Minthara - Plargg2 - Solphim - Otharri - Graaz - Ratchet - Soundwave - Slicer - Gale - Rootha - Kagemaro - Blorpityblorpboop - Kayla - SliverQueen - Ivy - Falco - Gluntch - Charlatan/Wilson - Garth - Kros - Anthousa - Shigeki - Light-Paws - Lukka - Sefris - Ebondeath - Rokiric - Garth - Nixilis - Grist - Mavinda - Kumano - Nezahal - Mavinda - Plargg - Plargg - Extus - Plargg - Oracle - Kardur - Halvar - Tergrid - Egon - Cosima - Halana+Livio - Jeska+Falthis+Obosh - Yeva - Akiri+Zirda - Lady Sun - Nahiri - Korlash - Overlord+Zirda - Chisei - Athreos2 - Akim - Cazur+Ukkima - Otrimi - Otrimi - Kalamax - Ayli+Lurrus - Clamilton - Gonti - Heliod2 - Ayula - Thassa2 - Gallia - Purphoros2 - Rankle - Uro - Rayami - Gargos - Thrasios+Bruse - Pang - Sasaya - Wydwen - Feather - Rona - Toshiro - Sylvia+Khorvath - Geth - QMarchesa - Firesong - Athreos - Arixmethes - Isperia - Etali - Silas+Sidar - Saskia - Virtus+Gorm - Kynaios - Naban - Aryel - Mizzix - Kazuul - Tymna+Kraum - Sidar+Tymna - Ayli - Gwendlyn - Phelddagrif - Liliana - Kaervek - Phelddagrif - Mairsil - Scarab - Child - Phenax - Shirei - Thada - Depala - Circu - Kytheon - GrenzoHR - Phelddagrif - Reyhan+Kraum - Toshiro - Varolz - Nin - Ojutai - Tasigur - Zedruu - Uril - Edric - Wort - Zurgo - Nahiri - Grenzo - Kozilek - Yisan - Ink-Treader - Yisan - Brago - Sidisi - Toshiro - Alexi - Sygg - Brimaz - Sek'Kuar - Marchesa - Vish Kal - Iroas - Phelddagrif - Ephara - Derevi - Glissa - Wanderer - Saffi - Melek - Xiahou Dun - Lazav - Lin Sivvi - Zirilan - Glissa - Ashling1 - Angus - Arcum - Talrand - Chainer - Higure - Kumano - Scion - Teferi1 - Uyo - Sisters
PDH - Drake - Graverobber - Izzet GM - Tallowisp - Symbiote
Brawl - Feather - Ugin - Jace - Scarab - Angrath - Vraska - Kumena
Oathbreaker - Wrenn&6

Toshi
ʕ•ᴥ•ʔ
Posts: 645
Joined: 4 years ago
Pronoun: he / him
Location: Freiburg, Germany

Post by Toshi » 4 years ago

For my decks it comes down to 3 factors in this order:

1. Consistency
How reliable are my mid-game plans and win cons? Does it take some very specific cards to get going or are draw, ramp and redundancy in place to reproduce lines of play?
2. Resiliency
How well does my deck recover from disruption? How well does it interact with others? How many cards and effects are there that can completely halt my strategy?
3. Speed
Is my deck going to win/take out the first opponent on a somewhat early turn like 4 or 6 if left unchecked? How easy is it for me to pick up the pace once boards stabilize?

If I were ever forced to put it into numbers specifically, I'd probably rate my decks from 1 to 3 in each category, then sum things up. The resulting scale ranging from 3 to 9 would be pretty realistic, since I put too much effort into brewing decks to ever fall below 3 and exclude certain things like fast mana, fetches and full tutor suites that would be needed to ever think of anything others would call a 10.
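
As a tiny sketch of the arithmetic described above, assuming each of the three categories is scored as a whole number from 1 to 3 and the scores are simply added. The function name `three_axis_rating` is made up for this example.

```python
def three_axis_rating(consistency, resiliency, speed):
    """Sum three 1-3 category scores into an overall 3-9 rating."""
    for label, value in (("consistency", consistency),
                         ("resiliency", resiliency),
                         ("speed", speed)):
        if value not in (1, 2, 3):
            raise ValueError(f"{label} must be 1, 2 or 3, got {value!r}")
    return consistency + resiliency + speed


print(three_axis_rating(3, 2, 2))  # 7
```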

toctheyounger
Posts: 3991
Joined: 4 years ago
Pronoun: he / him
Location: Auckland, New Zealand

Post by toctheyounger » 4 years ago

DirkGently wrote:
4 years ago
If it's one of my decks, it's a 7.

Unless people don't like it. Then it's an 8.

Jeez, you guys have some complicated-ass rating systems going on over there. lol.

For other players, I want to know if someone's playing a cEDH deck, or if they're playing a deck with 2-card combos, MLD, etc. Tapping out turn 5 and getting blindsided by armageddon is not my fave. Beyond that I'm happy to figure it out on my own.
There's a lot to be said for this approach. Being able to sit down at a table and go 'oh hey that's Dave, he plays 8 tops' is super easy.

BaronCappuccino
Posts: 246
Joined: 4 years ago
Answers: 1
Pronoun: he / him
Location: Quiet Corner

Post by BaronCappuccino » 4 years ago

I don't really know how to rate my decks, so I don't bother. I take ideas that work and make them clunky and less effective by giving them handicaps and pairing them together in unintended ways. I'm the steampunk of the deckbuilding community.

darrenhabib
Posts: 1834
Joined: 4 years ago
Pronoun: Unlisted

Post by darrenhabib » 4 years ago

@pokken, I like that you're being ambitious. I'm a nerd who codes, and I totally love this sort of thing.

Of note, quite often the land base gives away power levels quite a bit.
As soon as an opponent plays a tapped land like Akoum Refuge you know it is probably going to be a low-powered deck.
So it's a little tricky with just assigning points: a basic land might not be worth any points, but a land like this would almost be worth negative points.
I feel like you could have an algorithm for the land base alone and be able to tell a lot about the deck.

As far as combos, I started working on an infinite and strong synergy combo thread, but I stopped because I got overwhelmed.
But I realized that I was better to try and organize a spreadsheet, and I'd be happy to work with you on implementing something of a "database" that you could work off.
But it's the sort of thing that would probably take me 3-6 months to do. Trust me, once you try to be comprehensive it starts to run away a little.

What can be a little weird is that Sol Ring is in almost every deck, due to price and availability, so assigning points to it probably has no effect on the overall scale. But obviously Mana Crypt is harder to get hold of, and even though its power level is comparable, its inclusion is going to be much more of a giveaway about the power level of the deck.
Same with Imperial Seal versus Demonic Tutor. Power levels are similar, but DT is going to feature in most decks, whereas IS is only going to feature in potentially more serious decks.

Interestingly, there is actually not much stopping us from eventually using this website to run the algorithm, as the cards are in a database and it's possible to assign points to them. If that makes sense, it's no different from how other attributes like ratings get assigned.
Now obviously we'd need the website guru @Feyd_Ruin to be onboard and help with implementing something like this, but the idea would be that you could run the algorithm against deck lists. Don't worry Feyd I'm talking waaay off in the future for something like this as I'm sure it would take at least 6-12 months to have figured out ratings and an algorithm in another system anyway. But I just want to point out that this sort of thing is a possibility.
There are already some algorithm calculation tools on the site that you can check out here.

Assigning points will be pretty daunting as well, as a lot of it is going to be a bit subjective and biased.
Another thing you'd want to account for is your commander: if it's part of a combination, you'd want to weight it a little higher.
For example, an infinite mana combo should be weighted higher with a commander that can use infinite mana.
Also, say you have Niv-Mizzet, Parun + Curiosity as part of your 99; this isn't as strong as when you have him as your commander.
But that's fine, that is something that I could work on as far as separating out combinations.

Obviously it would never be a perfect system, it would be just too hard to cover everything..well maybe.
I mean sure you can include Flash + Protean Hulk with points, but the deck might not actually have a way to win with it.
For example if you tried to include every combination like..
Flash + Protean Hulk + Thassa's Oracle+ Spellseeker + Demonic Consultation + Blood Artist
Flash + Protean Hulk + Thassa's Oracle + Nomads en-Kor + Cephalid Illusionist
and the other literally hundreds of ways you could win, then it starts escalating pretty uncontrollably.

Just like Isochron Scepter + Dramatic Reversal might actually suck in a deck because it doesn't run enough nonland mana sources.

And then trying to assign points to Underworld Breach is a very hard thing to do, as so many factors go into how good it is in a deck. Just trying to assign an arbitrary number like "40" wouldn't be correct, given you need cards to make it good.
Trying to cover combinations I just don't think is possible.

But I think as long as we concede that it's just a system for approximating deck strengths, we don't set ourselves up for failure by trying to be perfect.

pokken
Posts: 6354
Joined: 4 years ago
Answers: 2
Pronoun: he / him

Post by pokken » 4 years ago

So my inclination was originally to write it in Python, but I think I might just have to bone up on my JavaScript since it's a lot easier to host on a website. I'm pretty sure I could pack the entire thing into a single webpage honestly, since the baseline stuff is pretty simple math / counting / loops.

Another idea I had was to try training an ML model- I feel like if we could give it a curated dataset of a thousand decks that were accurately rated from 1 to 10 we could get an ML model that was fairly accurate at predicting. I've only done one ML exercise though.

------------------------------------------

Quick comments

I think you're actually pretty spot on that a tight manabase is a good clue; it's one of the reasons I gave the fetchlands pretty high point values in my initial draft.

Re: Sol ring - since some people do not play sol ring on purpose I think it's pretty important to go ahead and not try to make concessions for ubiquity. Same with demonic tutor. If you're playing DT to find your darksteel reactor I think that'll show up in the rest of the list.

Even if you play a perfect manabase and all the mana rocks and tutors but then you're doing 20 cards worth of jank, the 20 cards worth of jank will bring your average down enough so we know it's not a CEDH deck.

The most likely scenario I think is that someone's deck is accidentally rated slightly high because their endgame is bad but their facilitation is good - I think that's something we could correct for by weighting certain finishers more highly, or by adding algorithms that say "well, if your curve is <2 and you're running 1000 points worth of mana rocks and 500 points of tutors, your entire score is multiplied by 1.3" or whatever.
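
A short sketch of the kind of correction rule described above, using the example thresholds from this post (curve under 2, 1000 points of mana rocks, 500 points of tutors, multiplier 1.3). Reading "running 1000 points" as "at least 1000" is an assumption, and `apply_facilitation_bonus` is a hypothetical name.

```python
def apply_facilitation_bonus(total, avg_cmc, rock_points, tutor_points,
                             bonus=1.3):
    """Multiply the whole score when a low curve meets heavy rocks and tutors."""
    if avg_cmc < 2 and rock_points >= 1000 and tutor_points >= 500:
        return total * bonus
    return float(total)


print(apply_facilitation_bonus(2000, 1.8, 1000, 500))  # bonus applies
print(apply_facilitation_bonus(2000, 2.4, 1000, 500))  # curve too high: 2000.0
```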

------------------------------------------

It would surely be pretty easy to trick the algorithms if you were trying; but I honestly think just a curve analysis + checking for 100-200 or so cards with reasonably approximated point values would get you to 90% success at power level estimation on any real deck someone has built.

------------------------------------------

The key thing to do I think is to make the design flexible enough to be able to respond. One thing I was considering doing in the data set is to add tags to every card in addition to scores to allow some other angles of attack for adding new layers.

Card, Cost, Tags
Mana Crypt, 100, {fast_mana, rock, expensive, 0_cmc}
Sol Ring, 80, {fast_mana, rock, ubiquitous, 1_cmc, no_drawback}
Enlightened Tutor, 55, {tutor, 1_cmc, instant, mirage_tutor, artifact_tutor, enchantment_tutor}
Trinket Mage, 3, {tutor, 3_cmc, creature, artifact_tutor}

Then if you wanted to do stuff like add +10% to the value of mana crypt if you have more than 2 {artifact_tutor} in your list you could do that. Or if you wanted to add 5% if you have a high powered card that can be searched for with trinket mage, blah blah, etc, etc.
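
A minimal sketch of the tagged data set and the "+10% to Mana Crypt if you have more than 2 artifact tutors" rule, assuming each card is stored as a (points, tags) pair. The four entries are the sample rows above; `CARDS`, `score_with_tags` and the `tutor_threshold` parameter are hypothetical names for illustration.

```python
# card name -> (points, set of tags), taken from the sample rows above
CARDS = {
    "mana crypt": (100, {"fast_mana", "rock", "expensive", "0_cmc"}),
    "sol ring": (80, {"fast_mana", "rock", "ubiquitous", "1_cmc",
                      "no_drawback"}),
    "enlightened tutor": (55, {"tutor", "1_cmc", "instant", "mirage_tutor",
                               "artifact_tutor", "enchantment_tutor"}),
    "trinket mage": (3, {"tutor", "3_cmc", "creature", "artifact_tutor"}),
}


def score_with_tags(decklist, cards=CARDS, tutor_threshold=2, bonus=0.10):
    """Base points plus a Mana Crypt bonus when enough artifact tutors exist."""
    names = {n.strip().lower() for n in decklist}
    names = {n for n in names if n in cards}
    total = float(sum(cards[n][0] for n in names))
    tutors = sum(1 for n in names if "artifact_tutor" in cards[n][1])
    # Rule from the post: more than `tutor_threshold` artifact tutors
    # makes Mana Crypt worth an extra `bonus` fraction of its own points.
    if "mana crypt" in names and tutors > tutor_threshold:
        total += bonus * cards["mana crypt"][0]
    return total


deck = ["Mana Crypt", "Enlightened Tutor", "Trinket Mage"]
print(score_with_tags(deck))  # 158.0, only two artifact tutors so no bonus
```

Keeping the tags as plain sets means a new rule only needs a membership test, which seems to be the flexibility the tagging idea is after.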

Anyway, I'll leave off of sidetracking. What I think I might do is start working on the data model a bit more, since I think understanding what cards mean is the most important aspect of the whole thing.

When I get it moving I'll kick a thread off with the initial dataset.

In my head the key information pieces are:

Cards (point cost, tags)
Combinations (point cost, tags)
Curve (formula for curve multiplier)

My math is 20 years out of date, but initial thought for curve multiplier is like, 1/ln(x) where x is the curve, resulting in:
2.47 for 1.5
1.44 for 2
0.91 for 3
0.72 for 4

Might need some nudging but seems close...probably go pretty nuts on a deck that is all 1-drops (where a curve of 1.1 results in a multiplier of 10.5 or so, lol).

I hesitate to make it something other than a formula though since the difference between a 1.5 and 2 curve in EDH power is very high, and most CEDH decks stop at 1.6 or so -- I'd hate to have hard cutoffs where 1.61 and 1.6 get massively different multipliers :P
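
A quick sketch of the 1/ln(x) curve multiplier, mostly to check the numbers quoted above. The function name `curve_multiplier` is an assumption; the guard for curves at or below 1 is added here because ln(1) is 0 and the formula blows up, which the all-1-drops case hints at.

```python
import math


def curve_multiplier(avg_cmc):
    """Return 1 / ln(avg_cmc); only meaningful for an average curve above 1."""
    if avg_cmc <= 1:
        raise ValueError("1/ln(x) is undefined at x = 1 and not positive below it")
    return 1.0 / math.log(avg_cmc)


for curve in (1.1, 1.5, 2, 3, 4):
    print(curve, round(curve_multiplier(curve), 2))
# 1.1 10.49
# 1.5 2.47
# 2 1.44
# 3 0.91
# 4 0.72
```

Because the function is continuous, 1.60 and 1.61 get nearly identical multipliers, which avoids the hard cutoffs mentioned above; the steep climb near 1 would probably still need a cap.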

And wow I'm really done with that brain vomit. sorry guys :)

Feyd_Ruin
Elder Vampire
Posts: 5410
Joined: 5 years ago
Answers: 3
Pronoun: he / him

Post by Feyd_Ruin » 4 years ago

darrenhabib wrote:
4 years ago
Interestingly, there is actually not much stopping us from eventually using this website to run the algorithm, as the cards are in a database and it's possible to assign points to them. If that makes sense, it's no different from how other attributes like ratings get assigned.
There's quite a few questions that have to be well answered for this.

Who assigns the ratings and how?
People have different feelings on cards, so I would guess something community driven, but it only works if you get a lot of responses and throughput.

Are the ratings per card, or do they look for 2+ card interactions?
Flash is a perfectly balanced card with no power level issues or degenerate interactions, unless you flash in certain creatures. Then it becomes the bane of cEDH. I know people who run Flash as generally nothing more than a Quicken. Even "abused" with something like Thragtusk would generally be accepted by most as just a cool 2 card interaction. But then Protean Hulk...

Flash is a good example, but far from the only one. Titania's Song is a great defensive card, but mixing it with Mycosynth Lattice can be game crushing. Rings of Brighthearth is run to power level commanders with abilities all the time, but goes infinite with Basalt Monolith really fast. Palinchron vs Palinchron and High Tide. The list goes on.

Assuming we have to answer the last question with "yes", how complex do we go, and who drives all this data?
Do we go three cards deep? Four? Who lists all of these and rates their competitiveness?

These are the main three that need to be answered first. Please know these aren't rhetorical "why it can't happen" questions. If we can come up with strong answers, it makes it possible. The coding and math are actually the easy part to me. The database it's run against seems the hard part.
To the beaten, the broken, or the damned; the lost, and the wayward: wherever I may be, you will have a home.

pokken
Posts: 6354
Joined: 4 years ago
Answers: 2
Pronoun: he / him

Post by pokken » 4 years ago

re: Who

I dunno. That's the hard part. The numbers will have to be modified a lot, and tons of cards will need to be put in there.

If we start on my Cultivate = 1, Mana Crypt = 100 scale, there're a TON of cards in between those two, probably 500 or so.


re: Cards

Some individual cards are very powerful on their own and get points. Most of the cards on this list are already on the canadian highlander points list, but they do things like:

1) generate a ton of resources not commensurate with their costs (fast mana, yawg's will, mystic remora)
2) allow tutoring for specific cards at an aggressive cost (mirage tutors)

Just having rhystic study in your deck is good for (say, hypothetically) 25 points because it generates a crapload of cards, even if those cards are dinosaurs.

re: combos

My rough draft data model for card combos is a list of lists, where to qualify for the combo a deck must contain >= 1 card from each inner list.

So the Thassa's Oracle chains look like

[ [ "thassa's oracle", "jace, wielder of mysteries"], ["tainted pact", "demonic consultation"] ] == 80 points

This gets challenging when you then have another list that is

[ [ "thassa's oracle", "jace, wielder of mysteries"], ["paradigm shift"] ] = 30 points

My current thought is that this is just an OK problem to have -- you've got to be careful not to make the combo list so cumbersome that running a redundant combo gets you erroneously counted double, but I think it'd be fixable with some diligence.

edit: The other problem with my list of lists method is that it makes Jace-wielder + Tainted Pact equivalent in power to running all four, which is probably not accurate.

This could be balanced by giving all of the cards individual points that help fix it, or could be reduced to just a list (if you have all these 3 cards you get 100 points).
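
A sketch of the list-of-lists combo model, assuming a deck qualifies when it contains at least one card from each inner list. The two entries and their points are the examples above; `COMBOS` and `combo_points` are hypothetical names.

```python
# Each entry: (list of "slots", points). A slot is a list of interchangeable cards.
COMBOS = [
    ([["thassa's oracle", "jace, wielder of mysteries"],
      ["tainted pact", "demonic consultation"]], 80),
    ([["thassa's oracle", "jace, wielder of mysteries"],
      ["paradigm shift"]], 30),
]


def combo_points(decklist, combos=COMBOS):
    """Award a combo's points when every slot has at least one card in the deck."""
    deck = {name.strip().lower() for name in decklist}
    total = 0
    for slots, points in combos:
        if all(any(card in deck for card in slot) for slot in slots):
            total += points
    return total


print(combo_points(["Thassa's Oracle", "Tainted Pact"]))  # 80
print(combo_points(["Thassa's Oracle", "Jace, Wielder of Mysteries",
                    "Tainted Pact", "Demonic Consultation"]))  # also 80
```

The second call shows the weakness noted in the edit above: running all four cards scores the same as running just two of them.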



edit: draft data model

Inkeyes22
Posts: 118
Joined: 4 years ago
Pronoun: he / him

Post by Inkeyes22 » 4 years ago

man, I really wish we had access to the original forum on mtgcommander.net. I typically go off the power level used here: https://www.youtube.com/watch?v=dDVrAb_ ... u.be&t=264 because it seems to be fairly universally used.

Here is a summary if you don't want to watch that video:
0-1: It's hard to consciously bring yourself to make a deck this actively bad unless you're going for an off-beat theme like "griffon tribal."
2-3: These decks might be a jumble of rares (and not necessarily good ones) that someone had in their collection. The commander exists to provide the color identity needed to play all of these favorite cards, even though many do not synergize with each other. Consistency is typically low because there is no coherent idea for the deck.
4-5: Most of the Commander pre-constructed decks are at this power level. There are some themes in the deck, but maybe the deck just plays slowly due to a lack of ramp. Possibly there are too many ideas going on at once, or maybe the win-conditions are just not very powerful.
6-8: These decks have a very clear game plan. They are pretty efficient and/or flexible. Research and refinement of deckbuilding is likely. Upgraded pre-cons and decks that have gone through experimentation and updating typically are around here. The financial costs of these decks can be higher due to the expense of consistent, faster mana bases or sought-after singles that are high impact.
9-10: These decks are built to win. They are competitive-tier (CEDH) decks that are very fast and very consistent. Often, these decks use expensive cards (like Mox) to help speed up the deck even more.

This list is pretty comprehensive and gives a good idea of what type of game to expect based on the commanders:

http://tappedout.net/mtg-decks/list-mul ... s-by-tier/

I have found that far too many people will say they have a 7, then they use Uro, Titan of Nature's Wrath and Dramatic Reversal/Isochron Scepter with a hidden commander Thrasios, Triton Hero to win on turn 5. So if people say "mine is a 7 too" but they have Zur, I keep up W and let them know it, even if I don't have Swords to Plowshares in hand.

darrenhabib
Posts: 1834
Joined: 4 years ago
Pronoun: Unlisted

Post by darrenhabib » 4 years ago

Feyd_Ruin wrote:
4 years ago
darrenhabib wrote:
4 years ago
Interestingly, there is actually not much stopping us from eventually using this website to run the algorithm, as the cards are in a database and it's possible to assign points to them. If that makes sense, it's no different from how other attributes like ratings get assigned.
There's quite a few questions that have to be well answered for this.

Who assigns the ratings and how?
People have different feelings on cards, so I would guess something community driven, but it only works if you get a lot of responses and throughput.
Without being presumptuous, or actually asking for specific functionality, I'll give you my "perfect world" scenarios.

Because this data is specifically for rating commander decks using the pokken algorithm, it would only be a moderator that assigns point values. Realistically me & pokken, because we are the ones who are investing time into the idea.
Yes, it will be subjective and biased. Obviously the point is to make points as realistic and balanced as we can, but it's just not possible to put it entirely into a public pool to assign values initially, because you need a starting point.
Now that's not to say that polls won't be put out over time to try and balance point values; part of the fun will be getting less subjective results that are more aligned with the populace.
But it will never be a public thing like the ratings system we have.
At the end of the day, the disclaimer is that the algorithm has been designed by individuals. It is the pokken algorithm and not a publicly designed system.
Are the ratings per card, or do they look for 2+ card interactions?
Flash is a perfectly balanced card with no power level issues or degenerate interactions, unless you flash in certain creatures. Then it becomes the bane of cEDH. I know people who run Flash as generally nothing more than a Quicken. Even "abused" with something like Thragtusk would generally be accepted by most as just a cool 2 card interaction. But then Protean Hulk...

Flash is a good example, but far from the only one. Titania's Song is a great defensive card, but mixing it with Mycosynth Lattice can be game crushing. Rings of Brighthearth is run to power level commanders with abilities all the time, but goes infinite with Basalt Monolith really fast. Palinchron vs Palinchron and High Tide. The list goes on.
So I'm going to approach this from the much bigger picture I have for the website. I've already hinted to you at some functionality that would be great to have on the website, and that is having relationships between cards.
The idea is that anybody can build up relationships between cards.
You will have different relationships as well. X has a relationship to Y in a Z type manner.
Z might be "Similar to" so a thesaurus type relationship.
Z might be "Combos with", so these cards have synergy together.

So specifically Z could be..
  • Similar to
  • Better than
  • Worse than
  • Combos with
  • Referenced by
I'm sure we would come up with more, but as long as they are just considered attributes, any number of different relationships could be added to in the future if wanted.
Somebody might ask for a "nonbos" (maybe "Conflicts with") relationship to be added, and the way it's coded means the answer is "yeah sure, no problem".

My vision is that the public can build up the relationships between cards. It might be that an additional layer of "approving" the relationship is needed by a moderator, just so that a bot or user doesn't do silly things to reference cards.

Then on top of this, numbers can be assigned to these relationships.
X=Flash
Y=Protean Hulk
Relationship=Combos with
EDHPoints=60

Only a moderator can assign the points value.

Now this doesn't cover X has a relationship to Y that has a relationship to A, etc. What I'm saying is that once you go past two-card relationships, the idea becomes much harder.

It might be that in order to build up N numbered relationships we need to introduce a way to group cards into a list instead of just X with Y.
So I'll just give an example of a really convoluted 5 card combo;
Griffin Canyon + Conspiracy + Candelabra of Tawnos + Karn, Silver Golem + Bubbling Muck.
So I feel like from a systems point of view you'd want to create an ID that identifies a relationship between cards.
RelationshipID = "GC_C_CoT_KSG_BM".
Relationship = "Combos with"
EDHPoints=5

Then you'd be able to create an array of cards within that RelationshipID;
Griffin Canyon
Conspiracy
Candelabra of Tawnos
Karn, Silver Golem
Bubbling Muck

Feyd, I'm literally babbling out loud, but hopefully you'll see where I'm coming from with a universal approach to card relationships and how much can come out of it.

It would be cool from a user's point of view that you can even show them card combinations (relationships) given their deck list.
Say they have Karn, Silver Golem in their deck list: you can look up card array relationships and suggest cards that combo with it.
This is really next level stuff from a Magic website point of view. You might have cards in your deck that have quite a few relationships with the same card, so you can see these numbers and it will be a revelation that it combos with so many cards in your deck.
If you have 3 different cards that have a relationship to a card not in your deck, then it's going to be a real flag for you to consider it.
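
A rough sketch of the relationship records and the suggestion idea, assuming each relationship is stored as an ID, a type, a points value and a set of cards. The Flash + Protean Hulk and Griffin Canyon entries use the values given above; the ID "F_PH", the field names and the function `suggest_cards` are made up for illustration and are not an existing site feature.

```python
from collections import defaultdict

RELATIONSHIPS = [
    {"id": "F_PH", "type": "Combos with", "points": 60,
     "cards": frozenset({"flash", "protean hulk"})},
    {"id": "GC_C_CoT_KSG_BM", "type": "Combos with", "points": 5,
     "cards": frozenset({"griffin canyon", "conspiracy",
                         "candelabra of tawnos", "karn, silver golem",
                         "bubbling muck"})},
]


def suggest_cards(decklist, relationships=RELATIONSHIPS):
    """List missing cards by how many distinct deck cards combo with them."""
    deck = {name.strip().lower() for name in decklist}
    related = defaultdict(set)  # missing card -> deck cards it combos with
    for rel in relationships:
        if rel["type"] != "Combos with":
            continue
        owned = rel["cards"] & deck
        for missing in rel["cards"] - deck:
            related[missing] |= owned
    counts = {card: len(owned) for card, owned in related.items() if owned}
    # Most-connected suggestions first, ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


print(suggest_cards(["Karn, Silver Golem", "Conspiracy"]))
# [('bubbling muck', 2), ('candelabra of tawnos', 2), ('griffin canyon', 2)]
```

A suggestion with a count of 3 or more would be the "real flag" case described above.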
Assuming we have to answer the last question with "yes", how complex do we go, and who drives all this data?
Do we go three cards deep? Four? Who lists all of these and rates their competitiveness?

These are the main three that need to be answered first. Please know these aren't rhetorical "why it can't happen" questions. If we can come up with strong answers, it makes it possible. The coding and math are actually the easy part to me. The database it's run against seems the hard part.
I guess I've tried to answer these with the previous question.
But as far as driving the data, it either comes off a database or, more realistically, because it's unlikely you have the time to do an import system, a moderator puts these numbers in.

Now even though I've said the public as a whole should be able to create relationships between cards, as I'm personally looking to build up an infinite combo relationship of cards, I would do a lot of these myself. It would be me literally going through a spreadsheet, adding the relationship to the website and ticking off as I go. So I would personally be trying to do as many of the "Combos with" relationships as I can.
And I would be assigning numbers for the point allocations to these for EDH "competitiveness" as I go, no matter how flawed and biased the numbers are.

Please don't take this request as gospel, it's all hearsay at the moment, but it definitely gets the ball rolling as far as looking to the future.
Plus pokken and I would be building the algorithm (and dataset) outside of the website initially anyway, so don't feel a burden of "you must do this" in order for any traction on this idea to be done.
Last edited by darrenhabib 4 years ago, edited 1 time in total.

pokken
Posts: 6354
Joined: 4 years ago
Answers: 2
Pronoun: he / him

Post by pokken » 4 years ago

The more I think about it, if I were doing it as a one-man show I would start with purely the individually ranked cards and leave the interactions for later.

I think you could get to 80% accuracy or better just by identifying the individually powerful cards. It's not Flash + Hulk that makes cEDH decks powerful after all; it's far more often defending a Tymna the Weaver or Mystic Remora until they achieve overwhelming advantage. Or Vampiric Tutor finding the pieces.

I would probably try to tag all the cards as best I can.

I do really like Darren's relationship model -- basically you'd check every card in the deck for all of its relationships, so if you have a Thassa's Oracle you might get 20 points for Demonic Consultation and 15 more for Tainted Pact. Pretty easy to identify the relationships uniquely so you don't get double charged for the reverse, just using umm, 3rd normal form I wanna say ;)
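A rough sketch of scoring relationships without charging the reverse pairing twice. The two point values are the ones from the example above; the table name `PAIR_POINTS` and the function are made up for illustration, and keying on an unordered pair is just one way to get the once-per-pair behaviour, whatever the eventual database design looks like:

```python
from itertools import combinations

# Hypothetical relationship table. Each unordered pair is stored exactly once,
# so (Oracle, Consultation) and (Consultation, Oracle) are the same key.
PAIR_POINTS = {
    frozenset({"Thassa's Oracle", "Demonic Consultation"}): 20,
    frozenset({"Thassa's Oracle", "Tainted Pact"}): 15,
}


def relationship_points(deck):
    """Sum the points of every known pair whose two cards are both in the deck."""
    cards = sorted(set(deck))  # de-duplicate so a pair is only visited once
    return sum(
        PAIR_POINTS.get(frozenset(pair), 0)
        for pair in combinations(cards, 2)
    )


oracle_deck = ["Tainted Pact", "Thassa's Oracle", "Demonic Consultation", "Island"]
total = relationship_points(oracle_deck)
```

Because `combinations` yields each pair of cards once and the lookup key ignores order, the Oracle deck is charged 20 + 15 and nothing for the mirrored pairings.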


Edit: If you want an edit link to my master sheet so you can add cards by categories please let me know. I started a pretty serious list of cards that could get >1 point. I drew the line at Nature's Lore instead of Cultivate, but I'm open to adding some more of course.

Mimicvat
Posts: 172
Joined: 4 years ago
Pronoun: he / him
Location: Auckland, New Zealand

Post by Mimicvat » 4 years ago

I think the actual 'pips' on a 1-10 scale are pretty meaningless. Personally I prefer to categorize the decks as "pure jank", "most decks", and "do I want to be here?".

No this isn't accurate. Yes you could game it easily by building a deck to capitalize on the rating system. And no, neither of those things really matter.

If it's a theme deck, or built around an obviously trashy mechanic (Lantern Control, or Hive Mind Burn for example), it's pretty much a jank deck no matter the card quality. I'll match it with my weaker decks or my own jank decks.

Then the next thing to determine is if it is in the 9/10 'dirty stuff' category. If it runs blue extra turn spells (the crappy lose-the-game ones don't count!), mass land destruction, stax or fast mana, it goes in this category. If so, the specific granularity between a "9/10" or a "9.9/10" or whatever isn't really important - I'm gonna bust out my best deck (mono red with several combos lol), probably lose to it (but maybe not :grin:), then decide if it is worth continuing to play in that pod. Sometimes it is, but usually if this is where people are at I'm going to look elsewhere, as I don't own a deck expensive enough to run with the cEDH and cEDH-lite crowds.

Combo is where it gets a bit tricky. If every combo in the deck is 3+ cards, and the deck cannot tutor for them, slap it in "most decks". If the deck is running two card combos involving the commander, or where one or more pieces have flash, or where they possess several (ideally 1-2 mana) ways to find the pieces? Stick it in "dirty" and call it a day. If it's somewhere between the two, lean on it being broken as hell and play the best deck you have.

Everything else is just 7/10 "normal decks". Doesn't matter if it has a random Mana Crypt or dirty tutor in it, they're probably in the same rough range as most of my decks and don't need special consideration to play against.
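For anyone who wants those three buckets as a checklist, here is one way the order of questions could be sketched. The `DeckTraits` fields are hypothetical yes/no answers a player fills in by hand, and `fast_mana` is meant as a real fast-mana package rather than one random Mana Crypt:

```python
from dataclasses import dataclass


@dataclass
class DeckTraits:
    # All fields are self-reported answers; the names are invented for this sketch.
    theme_or_trashy_mechanic: bool = False
    extra_turns: bool = False
    mass_land_destruction: bool = False
    stax: bool = False
    fast_mana: bool = False
    smallest_combo_size: int = 0  # cards in the smallest combo, 0 if no combos
    can_tutor_for_combo: bool = False


def classify(traits):
    """Walk the questions in the order the post asks them."""
    if traits.theme_or_trashy_mechanic:
        return "pure jank"
    if (traits.extra_turns or traits.mass_land_destruction
            or traits.stax or traits.fast_mana):
        return "do I want to be here?"
    if traits.smallest_combo_size > 0:
        # Only slow, untutorable combos count as safe; anything else,
        # including the in-between cases, is treated as broken.
        if traits.smallest_combo_size >= 3 and not traits.can_tutor_for_combo:
            return "most decks"
        return "do I want to be here?"
    return "most decks"


slow_combo = classify(DeckTraits(smallest_combo_size=3))
fast_combo = classify(DeckTraits(smallest_combo_size=2))
```

The point of the ordering is that the jank check comes first, so a theme deck stays in the jank pile no matter what else it runs.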
Currently building: ww Bruna, the Fading Light (card advantage tribal / reanimator)
Main decks;
r Neheb, Big Red Champion; g Yeva's Mono Green Control; b Ayara's Aristocrats; rb Greven, Predator Captain the One Punch Man; ugw Derevri, Empirical Tactician Aggro; rwbu Tymna & Kraum's Saboteurs; wbg Kondo & Tymna's Hatebears; wug Tuvasa's Silver Bullets; ur Brudiclad does Brudiclad things; gub Sidisi, Brood Tyrant (lantern control)

Dragonlover
Posts: 554
Joined: 4 years ago
Pronoun: he / him

Post by Dragonlover » 4 years ago

Based off that 1-10 scale posted earlier (first time I've actually seen it written down), I'd peg my decks in the 7-8 range. They're definitely better than most precons, they've all had work put in, but they're not really running anything truly degenerate.

Dragonlover
All my decks are here

folding_music
glitter pen on my mana crypt
Posts: 2271
Joined: 4 years ago
Pronoun: they / them

Post by folding_music » 4 years ago

I usually call my decks threes or fours out of ten! I don't really have many opportunities to test them so they're never refined into something competitive, they just exist in ice as a fancy stack of cards I like. When I read threads on here like the card of the day and people say "this is usually inferior to X" I tend to think "so what?", when people say every deck runs Sol Ring and Demonic Tutor I tend to think "do they?" One of my decks has a Whippoorwill in it because I think the card is cute.

That's not to say I don't run powerful things; my Siona deck unopposed can fill the board with soldiers or cats enough to kill an arbitrary number of players by turn five, but the deck isn't geared towards providing that experience every time. Most of my decks entirely eschew cards which tutor. What I like about the game is that there are so many different opening hands you can have, and a lot of the optimising speak I see seems to be at war with that concept. It feels like the old concept of the casual player has faded away a little.

Rumpy5897
Tuner of Jank
Posts: 1859
Joined: 4 years ago
Pronoun: he / him

Post by Rumpy5897 » 4 years ago

I'm not gonna lie, I gobbled those brain vomit posts up very happily. Trying to objectively quantify the power level of a 100 card dump would be a great way to circumvent the common fallacy of everyone rating their decks at a 7 (as evidenced by the thread). I hugely encourage you to continue and offer any and all assistance I can provide that you may need.

Something the system needs to be able to quantify is the overall gameplay trends of the list. From my limited experience with cEDH lists, the top tier ones are a pile of ramp, card advantage, interaction and tutors with a small combo win package. The proportion and impact of all the constituents of a list should act as a reasonable surrogate for gameplay. Just because a deck runs tutors and the Mana Crypt doesn't mean it's the bee's knees, as e.g. its wincons or interaction could be deeply lacking by comparison.

Another thing to keep in mind is commander choice and synergy. Pointing Crimson Wisps at a Feather or Zada is infinitely different to doing the same in some Rx deck with zero synergy with it. One could argue that using EDHREC synergy measures here would be a decent proxy, acting as a multiplier to the score, but this would still take some ironing out. Would also probably need to penalise commander dependence in some way. Still, my Feather is a scourge of my group, offering reasonably quick protected wins that any objective calculator would have trouble reconstructing from a pile of Gods Willing and whatnot.

In manners independent of this quantification system, something that I've found to impact deck strength is board presence. My fastest deck often has zero board before going in for the kill, which makes taking it out easier.
 
EDH Primers (click me!)
Deck is Kill Club

Sinis
Posts: 2041
Joined: 4 years ago
Pronoun: Unlisted
Location: Toronto, Canada

Post by Sinis » 4 years ago

I think the more complicated the rating system, the worse it gets. I've played against a pretty wide variety of players in the last 12 years or so, and some things seem to be near-universal. My scale is as follows.
A Rating System
100% (or cEDH): No holds barred. If it's legal, it's acceptable.

75%: Some things are banned in spirit: two- or three-card infinite combos (especially involving the commander), Land Destruction. Some general strategies (like Stax) are somewhat unwelcome.

50%: All Infinite combos are heavily frowned upon (unless they're like, 5 piece Rube Goldberg machines), some 'high-powered' cards that would never be played in cEDH are definitely unwelcome (stuff like Deadeye Navigator, Vorinclex, Expropriate, Jin-Gitaxias).

25%: We're playing tribal/theme decks that may have good synergies, but are probably not overwhelming. Tutors may be frowned upon.

0%: Unmodified Precon territory; these people treat EDH like a board game.
I think the most important feature is how the game is played. If a deck wins on turn 5, is it just a really effective aggro deck? Or did it just have good draws? There's a Mazirek player in our group where games end really fast, or the writing is on the wall pretty early, but I wouldn't call it cEDH because it's not as consistent as established cEDH decks.

pokken
Posts: 6354
Joined: 4 years ago
Answers: 2
Pronoun: he / him

Post by pokken » 4 years ago

Rumpy5897 wrote:
4 years ago
Another thing to keep in mind is commander choice and synergy. Pointing Crimson Wisps at a Feather or Zada is infinitely different to doing the same in some Rx deck with zero synergy with it. One could argue that using EDHREC synergy measures here would be a decent proxy, acting as a multiplier to the score, but this would still take some ironing out. Would also probably need to penalise commander dependence in some way. Still, my Feather is a scourge of my group, offering reasonably quick protected wins that any objective calculator would have trouble reconstructing from a pile of Gods Willing and whatnot.
Capturing some of those really offbeat pure synergy builds that actually are strong enough to trounce most 5/10 decks is going to be one of the most difficult things, because they play a huge pile of cards that are 0s in any other deck.

My suspicion is that in the medium term those could be captured by a commander point ranking. Zada and Feather are both strong enough to get some points.

But in the longer term, the synergy model that DH proposed is probably the best suited to handling that. For instance, giving Zada + each of her commonly played ridiculous bombs (e.g. Temur Battle Rage) a small point count might help capture that in Zada, Crimson Wisps is better than Mystic Remora.

What I would expect is that you'd need a specialist in each one of those weird synergy-driven decks to plot them against what's out there. For instance Whitemane Lion is probably some number of points in Ephara or Karametra, but largely jank elsewhere.

(Total hypothetical: if I were to give it a point value against what I've already plotted for rough point estimates, I would give it 7 points. Worse than Rhystic Study or Mystic Remora but close to as strong as Mindblade Render.)

Rumpy5897
Tuner of Jank
Posts: 1859
Joined: 4 years ago
Pronoun: he / him

Post by Rumpy5897 » 4 years ago

Whitemane Lion is an interesting example to try to ballpark my proposed heuristics for. Hitting up its EDHREC page, most top commanders make sense. Most. For whatever reason, the top commander for it is Syr Alin, the Lion's Claw. Most of that commander's six decks are ELD tribal, but they also seem to sport a 100% include rate of Shelter, another Whitemane Lion'esque bombo synergy card that goes massive in Feather. Why? Don't know. But they're both there at 100%, leading to crazy synergy percentages.

Come to think of it, trying to crowdsource public wisdom for more obscure synergies isn't gonna fly. Checking Whip Silk here does not show Eutropia the Twice-Favored anywhere. Apparently the general brewing populace sports a 4/51 include rate of the card.

Yeah this is going to be trickier than hoped for.
 

pokken
Posts: 6354
Joined: 4 years ago
Answers: 2
Pronoun: he / him

Post by pokken » 4 years ago

Rumpy5897 wrote:
4 years ago
Whitemane Lion is an interesting example to try to ballpark my proposed heuristics for. Hitting up its EDHREC page, most top commanders make sense. Most. For whatever reason, the top commander for it is Syr Alin, the Lion's Claw. Most of that commander's six decks are ELD tribal, but they also seem to sport a 100% include rate of Shelter, another Whitemane Lion'esque bombo synergy card that goes massive in Feather. Why? Don't know. But they're both there at 100%, leading to crazy synergy percentages.

Come to think of it, trying to crowdsource public wisdom for more obscure synergies isn't gonna fly. Checking Whip Silk here does not show Eutropia the Twice-Favored anywhere. Apparently the general brewing populace sports a 4/51 include rate of the card.

Yeah this is going to be trickier than hoped for.
Yeah, I think it's more likely to be successful having a single expert for each 'high synergy' general that gets identified.

What I'd expect is we'd start running decklists through the algorithm(s) and the gaps would be pretty self-identifying. Put Zada, Feather, or Karametra through it and you'll most likely get really low scores, despite tuned decks being pretty good.

One thing I'm considering early on is just weighting the commander points quite a bit higher than the normal 0-100 scale. For instance having Tymna the Weaver available to you all the time is probably worth a LOT more points than having a Mana Crypt in your 99. This basically just assumes that if you play a commander you will take some advantage of its capabilities, but I think that's not that unreasonable.
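As a sketch of what that weighting might look like: cards in the 99 count at face value and the commander's points get multiplied. The multiplier of 3 and the point values below are invented purely to show the arithmetic, not numbers from my sheet:

```python
def deck_score(commander, ninety_nine, card_points, commander_weight=3.0):
    """Total the points for the 99, then add the commander at a higher weight.

    card_points maps card name -> points on the normal 0-100 scale; cards
    missing from the table score 0. commander_weight is a placeholder value.
    """
    base = sum(card_points.get(card, 0) for card in ninety_nine)
    return base + commander_weight * card_points.get(commander, 0)


# Invented values: same raw points for both cards, to isolate the weighting.
sample_points = {"Tymna the Weaver": 40, "Mana Crypt": 40}
tymna_as_commander = deck_score("Tymna the Weaver", ["Mana Crypt"], sample_points)
tymna_in_the_99 = deck_score("Unlisted Commander", ["Tymna the Weaver", "Mana Crypt"], sample_points)
```

With equal raw points, the same card is worth three times as much in the command zone as in the 99, which is the whole idea; the right multiplier would have to come out of testing real lists.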

Sharpened
Posts: 193
Joined: 4 years ago
Pronoun: he / him

Post by Sharpened » 4 years ago

How much delineation is meaningful?

Look, the point of rating decks is to break the decks into categories/tiers so they can play against each other and have a good game. How many tiers do you need?

We have two "established" tiers to start with:
Unmodified Precons
cEDH

cEDH is obviously the top tier, that's basically its stated purpose. So how do you define/quantify where that tier ends?

How many meaningful levels are there between cEDH and a straight out of the box precon?
Are there any meaningful levels below a precon?
How much overlap is there among tiers? How do you communicate, evaluate that?

kenbaumann
on and off since '94
Posts: 88
Joined: 4 years ago
Pronoun: he / him
Location: Santa Fe, New Mexico, USA

Post by kenbaumann » 4 years ago

The questions I ask and answer about my decks to determine their power level:
  • What's the average CMC?
  • How many removal spells does the deck contain?
  • How many recursion spells does the deck contain?
  • Can the deck win at Instant speed?
  • Does the deck have one or more infinite combos?
  • If uninterrupted, anecdotally, when is the deck usually able to win?
Based on my answers, I come up with a 1 – 10 power level using my jankiest deck as an anchor (2/10) and my cEDH Urza deck as an anchor (9/10).
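One way the two-anchor idea could be put into numbers, assuming the answers above have already been boiled down to a single raw score per deck (how that raw score is built is left open here, and the function name and example values are hypothetical). The jankiest deck pins 2/10, the cEDH deck pins 9/10, and everything else is placed on the straight line between them:

```python
def anchored_rating(raw, raw_low, raw_high, low=2.0, high=9.0):
    """Linearly map a raw score onto the scale fixed by two anchor decks.

    raw_low / raw_high are the raw scores of the low and high anchor decks.
    The result is clamped to the 1-10 range.
    """
    if raw_high == raw_low:
        raise ValueError("anchor decks need different raw scores")
    rating = low + (raw - raw_low) * (high - low) / (raw_high - raw_low)
    return max(1.0, min(10.0, rating))


# Invented raw scores: jank anchor = 10, cEDH anchor = 80.
midrange = anchored_rating(45, raw_low=10, raw_high=80)
```

A deck scoring halfway between the two anchors lands at 5.5, and anything far beyond either anchor is capped at 1 or 10.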

I hope this answer helps!

pokken
Posts: 6354
Joined: 4 years ago
Answers: 2
Pronoun: he / him

Post by pokken » 4 years ago

Sharpened wrote:
4 years ago
How much delineation is meaningful?

Look, the point of rating decks is to break the decks into categories/tiers so they can play against each other and have a good game. How many tiers do you need?

We have two "established" tiers to start with:
Unmodified Precons
cEDH

cEDH is obviously the top tier, that's basically its stated purpose. So how do you define/quantify where that tier ends?

How many meaningful levels are there between cEDH and a straight out of the box precon?
Are there any meaningful levels below a precon?
How much overlap is there among tiers? How do you communicate, evaluate that?
Some good questions there. In my experience there are at minimum two tiers in between precon and cEDH, and this has generally been found by most people. The 5-adjacent and 8-adjacent decks are not going to play a lot of good games with each other. They surely can with nut draws, but play 10 games and the 5ish decks are going to get clobbered a lot of the time.

Understanding the boundaries is very difficult because decks act on such different axes. I'll give an example here that's pretty interesting:

If I play my Maelstrom Wanderer deck (6-7/10 imho) in a group of cEDH decks, despite it being fairly low power level, it will often steamroll the table. Why? It attacks on an axis that they are unprepared to defend - giant hasted creatures and extreme mana and value. If they neutralize each other at all I'll kill them all while they're trying to combo, or I accidentally kingmake someone if I guess wrong at who to kill first.

You'll see some very similar axis problems when 5 decks and 8 decks bump into each other; a common weakness my Ephara deck has is Blood Artist decks. Despite being able to win against very strong decks in its weight class, I'll sometimes just lose to not being able to keep from getting killed by 20 tokens and a Blood Artist.

In Legacy and Modern I've commonly referred to this effect as the "ships passing in the night" problem where decks feel like they're operating on axes that do not really intersect.

It's one of the most difficult things to assess when talking power levels in EDH; you may wind up with decks that are both 8/10 but woefully mismatched. Sigarda Enchantress vs. Bruna Enchantress for example, is likely to be a rather unpleasant game for Sigarda. :)
