Importing data from mlox · Issue #114 · loot/morrowind · GitHub

Importing data from mlox #114


Open
Ortham opened this issue Feb 15, 2025 · 11 comments

@Ortham
Member
Ortham commented Feb 15, 2025

Mlox is the most popular load order utility for Morrowind mods. It's inspired by BOSS but takes an approach that's more similar to LOOT (IIRC the two were developed independently though, I don't think I've ever looked into what mlox actually does before now), in that it defines load order rules that define the relationships between plugins instead of using a total load order, and then sorts plugins in a graph.

Mlox's base rules (equivalent to LOOT's masterlist) contain a lot of data: there are 7966 rules, 3710 of which affect load order (the rest are for messages). A direct comparison is difficult, for reasons I'll get into, but LOOT's masterlist only contains 80 plugin entries.

(There are also more recent rules at https://github.com/DanaePlays/mlox-rules, but that repo doesn't have a license and I haven't yet asked if it would be OK to port them.)

It would be good if mlox's rules could be translated into LOOT metadata:

  • It would mean that LOOT would be more useful to users who prefer it
  • It would make the data available to programs that already use libloot to access masterlist metadata

Users can just use mlox of course, but there's no mlox library that developers can use to access mlox's data: I'm only aware of mlox itself and plox as existing parsers of mlox's rules file format. mlox's code can't be used to just parse a rules file because it intertwines parsing and evaluation, while plox seems like it's in a better position as a starting point but I think would need changes to be useable as a library.


So, I've written my own parser (currently in the import-mlox.py script in the mlox branch) that seems to work - it doesn't follow the mlox documentation, because the docs don't agree with the actual base rules file, and mlox itself allows for more syntax variations than my parser does, but those variations don't seem to be used in practice. I've found that parsing isn't actually that complicated, except when you get to the expression syntax, because it uses expression type delimiters that can appear in filenames ([ and ]), and doesn't delimit filenames except with whitespace (which can also appear in filenames), so it gets a little tricky to tell where one thing ends and another begins.
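To illustrate the delimiter problem, here is a minimal heuristic sketch. It is not the approach taken in import-mlox.py; the `split_filenames` helper and its regex are assumptions made for this example. It treats a plugin extension followed by whitespace, `]` or end of text as the end of a filename, and it expects expression keywords such as `[ALL` to have been removed already.

```python
import re

# Heuristic (an assumption for illustration): a filename runs from the first
# non-space character up to the nearest ".esp"/".esm" that is followed by
# whitespace, a closing "]" or the end of the text.
PLUGIN_RE = re.compile(r"\S.*?\.es[pm](?=\s|\]|$)", re.IGNORECASE)


def split_filenames(text):
    """Split a run of whitespace-separated plugin filenames."""
    return PLUGIN_RE.findall(text)


names = "NerevarArmor-v30.esp\n     St. Nerevar'S Armor.esp"
# Naive whitespace splitting breaks the second filename into three tokens.
print(names.split())
# The extension-based heuristic keeps it whole.
print(split_filenames(names))
```

A filename that itself contains ".esp " or ".esm " in the middle would still defeat this heuristic, which is part of why the expression syntax is awkward to parse.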

The script I've written parses the rules file into an AST, and then attempts to convert that into a masterlist file. The parsing side is self-contained and just over 400 lines with type definitions, so may be reusable. It doesn't preserve comments, but I had a look at some and very few of them seem potentially useful, so it's probably best to manually port any that are.

The conversion side is more difficult. While mlox and LOOT are quite similar, there are some mlox features that LOOT doesn't support:

  • All filenames in mlox rules can contain special characters that are effectively regex. LOOT doesn't support regex in filenames for load after, requirement and incompatibility entries. I don't really want to change that, because I think the specificity is helpful (the rules don't depend on your local environment) and supporting regex would significantly complicate how the metadata is used. Unfortunately there are a lot of filenames that use special characters: 354 warnings are logged for after metadata, 38 for req and 649 for inc. To put that in perspective, that's 1041 warnings of invalid filenames vs. 27657 valid filenames, so although it's a large number it's still less than 4% of metadata entries.
  • mlox's equivalent to LOOT's version() condition function also supports extracting versions from filenames, which LOOT does not, which is also why there are 39 regex warnings for them (if the filenames didn't use regex then they'd have the version hardcoded, so no need to check it). I'd be open to adding another function that does the same thing, e.g. filename_version(regex, version, operator).
  • mlox has a SIZE predicate that LOOT has no equivalent of. There are 752 warnings about it not being supported. I'd be open to adding a file_size(path) condition to support this functionality. It seems to be used to distinguish between plugins with the same name: given the choice, I think LOOT's checksum() condition function is better for that.
    • One instance of the SIZE predicate uses a regex filename, which doesn't make much sense to me, I think that should just be manually replaced with the actual filename of the file of that size, or replace it with a regex file condition, because the message it's attached to isn't very specific anyway.
  • mlox has a DESC predicate that LOOT has no equivalent of. This causes 178 warnings (none for regex filenames), and again I'd be open to adding a description_contains(path) function to cover this functionality.
  • mlox has an additional level of message highlighting that sits between LOOT's say and warn levels. I don't see there being enough difference to warrant a distinction between normal messages and "low priority" messages, so I don't see the point of adding another level to LOOT: the script treats "low priority" messages as say messages.
  • mlox messages apparently support spoiler text, using <hide>spoiler</hide> tags, which LOOT doesn't. The script could convert that to something like <span class="spoiler">spoiler</span>, but LOOT currently strips out HTML when displaying messages, so that would need to be disabled, and it would also need to have some styling rules added for the spoiler text to actually work as intended. However, there is only one message in mlox's base rules that actually uses the <hide> tag, and I think it would be better just to strip out the tag and its contents: it doesn't add enough value for the effort to be worth it.

The script logs warnings whenever it encounters one of those differences.

Aside from those differences, there's a significant structural difference: LOOT's metadata is generally plugin-centric (e.g. most messages and all load after, incompatibility, requirement, etc. metadata sits within plugin objects), while mlox's is not.

For example, if you've got a [Requires] rule you need to parse its expressions to identify which plugins are dependents and which are dependencies, and because both are defined in expressions you can have a single rule defining relationships between groups of plugins that are quite complex (e.g. "this group contains these plugins and one or more of these plugins but only when these plugins don't exist, or when this plugin's version is this value and that other plugin's size isn't this value..."). Because the expressions also decide when mlox should display an error, they also need to be correctly converted to LOOT metadata conditions.

The way that mlox messages are displayed (relatively rudimentary compared to LOOT's approach, IMHO) also means that many of the messages are written so that they only make sense if you can see the expressions that they were defined for, which means they need a bit of deciphering. I haven't fully solved that problem in the script, but I think it's doable.

This structure makes the mlox rules much more concise: the LOOT metadata YAML produced by my script is currently more than twice the size of the mlox rules file (and more than 5x the size of the other masterlists!). It also makes it hard to unpick and convert the rules to a more plugin-centric form, so I've been focussing on the simpler expressions and building up to more complex expressions. As it stands, the script can convert about 94% of the rules, leaving 489 in need of manual conversion, which I think is good enough that manual conversion of those is practical (if a bit of a slog). I think I can improve coverage further, but I'm not aiming for 100%.

Aside from that, the formatting of the script's output could be improved:

  • Markdown special characters are escaped, but that's often unnecessary
  • it would be better if multi-line messages were formatted using text blocks instead of single-quoted strings
  • more line breaks, e.g. between plugin entries, would help readability
  • while anchors and aliases are used, they're given meaningless names, and they only seem to be used for repeated objects, not repeated strings, so I wouldn't be surprised if a lot of message text could be deduplicated

I'm hoping those improvements can be done in code, rather than manually.


Beyond the ability to translate between the mlox and LOOT metadata formats, it's also worth noting that mlox allows its rules to produce cycles, and will just ignore whichever rule is about to cause a cycle when it applies them at runtime. In contrast, LOOT expects its metadata (aside from group metadata) not to cause cycles and sorting will fail if they do. That means that a straight translation of mlox's [Order] rules might end up causing a lot of cyclic interaction errors.

I think that's fixable. The good news is that mlox doesn't seem to sort based on any plugin data, so it should be possible to brute-force finding cycles by building a graph out of all the order rules and scanning through that, and once found the cycles can be avoided by making the equivalent LOOT metadata conditional as necessary. I'm not sure how practical that is though. Not relying on plugin data also has the downside that the metadata also includes plugin masters, which is unnecessary for LOOT, but they can't be automatically stripped out without having the plugins to read.

Finally, mlox has NearStart and NearEnd rules that the script translates into Near Start and Near End groups that sit on either side of the default group: this should give similar behaviour, but it probably won't be exactly the same as the rules don't work the same way as groups, and I don't want to add complexity trying to bridge the gap.


So, in summary:

  • 94% of the mlox base rules can currently be converted, leaving 489 at least partially unconverted
  • There's a lot of data: the input file is about 1.6x the size of LOOT's largest masterlists, and the output is over 3.5x that!
  • The output formatting is a little rough, but I think can be improved
  • some of mlox's messages don't make sense if you can't see their accompanying expressions: I think I can enhance the conversion to address that
  • mlox supports conditional logic that can check things LOOT can't, but I'm willing to extend LOOT to provide equivalent functionality
  • mlox supports a "low priority" message type that LOOT doesn't: I think it's best to treat these as normal say messages
  • mlox supports spoiler text in messages and LOOT doesn't: I don't think it's worth supporting as it's only used once, and I'll update the script to just strip the text entirely
  • the big sticking point is that mlox supports regex in filenames for its equivalent of after, req and inc metadata, which I don't want to support. That affects about 1041 metadata entries, which is a large number but less than 4% of the total number of filenames added as after, req or inc metadata.
  • even after conversion, the metadata may give different results or cause cyclic interaction errors due to differences in how mlox and LOOT work
@Ortham
Member Author
Ortham commented Feb 16, 2025

I've implemented the file_size(), filename_version() and description_contains() condition functions in loot-condition-interpreter, so once that makes its way into a libloot release LOOT's conditions will be able to express almost everything that mlox expressions can, aside from:

  • checking versions in plugin descriptions using a regex path: mlox tries to read a version from the description first then the filename. There are only a small number of cases that take a regex path, and while it's ambiguous whether they are expected to find anything in the description (I'd guess no, because it's easier for a mod author to update and remember to update a filename than a version in the plugin description), I'd rather have that ambiguity resolved manually than bake it into LOOT's metadata.
  • checking file size using a regex path: there's only one instance of this, and it doesn't seem like it's adding much value. I'd rather manually resolve the ambiguity over which files could possibly have the size that's checked for.
  • checking description contents using a regex path: there aren't any cases where that functionality is used.

I've also updated the script to strip out <hide></hide> tags and their contents, since I don't think they're worth supporting.
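As a rough sketch of that kind of stripping (the `strip_hidden` helper is a name made up for this example, not the code in the import script), a non-greedy regex can drop each tag pair together with its contents:

```python
import re

# Matches a <hide>...</hide> pair and everything between the tags.
HIDE_RE = re.compile(r"<hide>.*?</hide>", re.IGNORECASE | re.DOTALL)


def strip_hidden(message):
    """Remove <hide> tags and their contents from a message."""
    stripped = HIDE_RE.sub("", message)
    # Assumption: also tidy up any whitespace left dangling before punctuation.
    return re.sub(r"\s+([.,;:!?])", r"\1", stripped)


print(strip_hidden(
    "the (short) quests to obtain it are different "
    "<hide>and they also add different enchantments to the armor</hide>."
))
```

The punctuation tidy-up applies to the whole message, not only to the spot where a tag was removed, so it may be too eager for messages that deliberately put a space before punctuation.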

I've also investigated whether the mlox [Order] rules cause cycles, by building a graph from all the [Order] rules and then checking if it's acyclic: it turns out it is, so that's not a problem. Though that was just treating all the plugin names as literals, not matching special characters, so maybe things fall apart when you do that.
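For reference, an acyclicity check of this kind can be sketched with Kahn's algorithm. This is a minimal illustration that assumes the [Order] rules have already been flattened into (before, after) filename pairs; `is_acyclic` is a hypothetical helper, not the code I actually ran.

```python
from collections import defaultdict, deque


def is_acyclic(edges):
    """Return True if the (before, after) pairs form a graph with no cycles.

    Filenames are compared case-insensitively and treated as literals.
    """
    successors = defaultdict(set)
    in_degree = defaultdict(int)
    nodes = set()
    for before, after in edges:
        source, target = before.lower(), after.lower()
        nodes.update((source, target))
        if target not in successors[source]:
            successors[source].add(target)
            in_degree[target] += 1

    # Kahn's algorithm: repeatedly remove nodes that have no incoming edges.
    queue = deque(node for node in nodes if in_degree[node] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for successor in successors[node]:
            in_degree[successor] -= 1
            if in_degree[successor] == 0:
                queue.append(successor)

    # Any node that was never removed is part of (or downstream of) a cycle.
    return visited == len(nodes)


print(is_acyclic([("A.esp", "B.esp"), ("B.esp", "C.esp")]))
print(is_acyclic([("A.esp", "B.esp"), ("b.ESP", "a.esp")]))
```

This only answers yes or no; reporting which plugins are involved in a cycle would need something like a depth-first search that records the path.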


The conditions that contain a filename_version() function are:

filename_version("LGNPC_TelUvirith_v<VER>_UI.esp", "1.31", <)
(filename_version("LGNPC_PaxRedoran_v<VER>.esp", "1.21", <) or filename_version("LGNPC_TelUvirith_v<VER>.esp", "1.20", <)) and file("Rise of House Telvanni.esm")
filename_version("DBAttack Tweaked <VER>.esp", "1.4", <)
filename_version("BetterMusicSystem_<VER>_alt.esp", "1.9.1", <)
filename_version("Balmora University V<VER>.esp", "2.3", <)
filename_version("Texture Fix <VER>.esm", "2.0", <)
filename_version("Uvirith's Legacy_<VER>.esp", "3.1", >)
filename_version("Uvirith's Legacy_<VER>.esp", "3.2", >)
filename_version("Class Abilities <VER>.esp", "3.1", <)
filename_version("Fishing Academy v<VER>.esp", "2.54", <)
filename_version("LGNPC_TelMora_v<VER>.esp", "1.30", <)
filename_version("KS_Julan_Ashlander Companion_<VER>.esp", "2.0", <)
filename_version("LGNPC_AldVelothi_v<VER>.esp", "1.20", <)
filename_version("LGNPC_GnaarMok_v<VER>.esp", "1.20", <)
filename_version("LGNPC_Indarys*Manor_v<VER>.esp", "1.51", <)
filename_version("LGNPC_Khuul_v<VER>.esp", "2.21", <)
filename_version("LGNPC_PaxRedoran_v<VER>.esp", "1.20", <)
filename_version("LGNPC_Secret*Masters_v<VER>.esp", "1.30", <)
filename_version("Nevena's Twin Lamps & Slave Hunters *.esp", "1.5", <)
filename_version("Pegas Horse Ranch v<VER>.esp", "3.1", <)
filename_version("Texture Fix <VER>.esm", "2.0", <)

Based on other rules in the base file, Nevena's Twin Lamps & Slave Hunters *.esp could also be written using <VER> instead of *, as matching substrings from other rules are 1.2 and 1.5.
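To make the intent of these conditions concrete, here is an illustrative sketch of the check they describe. It is not how loot-condition-interpreter implements filename_version(): the `<VER>` regex, the `version_key` comparison and the `filename_version_matches` helper are all assumptions for this example, and it only handles patterns with a single `<VER>` and no `*`.

```python
import operator
import re

# Assumption: <VER> stands for digits separated by ".", "_" or "-".
VER = r"(\d+(?:[._-]?\d+)*)"
OPS = {"<": operator.lt, ">": operator.gt, "==": operator.eq}


def version_key(text):
    """Simplified version comparison key: the numeric parts as a tuple.

    Note that this treats "1.2" and "1.20" as different versions.
    """
    return tuple(int(part) for part in re.findall(r"\d+", text))


def filename_version_matches(pattern, filename, version, op):
    """Check whether the version embedded in filename compares as requested."""
    prefix, _, suffix = pattern.partition("<VER>")
    regex = re.compile(re.escape(prefix) + VER + re.escape(suffix), re.IGNORECASE)
    match = regex.fullmatch(filename)
    if match is None:
        return False
    return OPS[op](version_key(match.group(1)), version_key(version))


print(filename_version_matches(
    "LGNPC_TelMora_v<VER>.esp", "LGNPC_TelMora_v1.20.esp", "1.30", "<"))
```

A real implementation would look for matching files in the data directory rather than taking a filename argument, and would need a proper version comparison.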

The rule with the SIZE regex path is:

[NOTE]
 The latest version of Slartibartfast's "Texture Fix - Balmora Expansion" is 1.4 (Balmora Expansion v1.4+(1.4).esp).
 mlox cannot deduce the exact version number you are using but please double-check you are running Slartibartfast's latest.
[SIZE !4495981 Balmora Expansion v1.4+(<VER>).esp]

and the rule with the <hide></hide> tags that now get stripped out is:

[Note]
 Both these mods add the same Armor of St Nerevar (as created by Enlightened_Daedroth and Soulshade), though the (short) quests to obtain it are different <hide>and they also add different enchantments to the armor</hide>.
[ALL NerevarArmor-v30.esp
     St. Nerevar'S Armor.esp]

@pStyl3
Member
pStyl3 commented Feb 16, 2025

You seem to be progressing at a steady pace, which is intriguing and great to see. One has to wonder what the final result of the conversion process will look like.

That being said, I think there is an elephant in the room, which I feel the need to address sooner rather than later. The issue I'm talking about is outdated and dead data. When we recleaned every single mod for which we had cleaning data in the various masterlists a couple of years ago, we first had to find literally hundreds upon hundreds (if not thousands) of mods. Until that point their location data wasn't stored in the masterlist, so finding them was a huge task.

Be that as it may, one lesson from that was that a lot of the plugins we had data for, particularly in the skyrim and oblivion masterlists, were no longer accessible anywhere. They had been deleted at some point, so there was no way to download them.

I'm not exactly sure if this was the reason, or if we had thought about this before working on the above-mentioned task, but since then we have implemented a rule that we only add new plugin entries if we know the source of the plugin. If we have data on a plugin, we want to know at all times where it can be downloaded from, since being able to re-evaluate plugins at a later date is an integral part of working on the masterlists.

So, if we want to add a new entry to one of our masterlists, we do it like this:

  - name: 'Tamriel_Data.esm'
    url: [ 'https://www.nexusmods.com/morrowind/mods/44537/' ]

Or even better:

  - name: 'Better Propylon Teleport Warp(-Master Index)?\.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/46364/'
        name: 'Pikas Miscellaneous Mods'

Looking at mlox_base.txt there are currently

  • 27415 entries of .es (so plugin entries that can be either .esp or .esm)
  • 171 urls starting with http://www.nexusmods.com/morrowind/mods/

These numbers don't necessarily need to be absolutely correct, but it does show the discrepancy.

Now, does finding plugin sources for many thousands of plugins sound very appealing to me? No, of course it doesn't. But the question really becomes, even if the conversion process is done and the result looks kind of neat - how much of that converted data actually connects to plugins, that are still available to this day? And how much converted data is for plugins, that are long since gone?

I really would like to see help from the community for this. We really need people, experienced Morrowind modders, to help us identify where plugins come from, or which plugins are no longer available anywhere.


All that being said, to have a starting point, I have gone through Nexus Top 60 Mods and put together the following list of plugins, that currently are not in LOOT's masterlist, but in mlox_base.txt. I've created entries for them in LOOT's format, though so far I haven't created the corresponding regular expressions (Regex).

Plugins

Morrowind Crafting

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/47392/'
        name: 'Morrowind Crafting'

Morrowind Crafting 2-1.esp
Morrowind Crafting Equipment.esp

Taddeus' Necessities of Morrowind

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/52158/'
        name: 'Taddeus'' Necessities of Morrowind'

NOM 3.0.esp
NoM Modders Res.esp
NoM_Creatures Loot PHW.esp
NoM_Creatures Loot Standard.esp
NoM_Grass FotG+FF.esp
Vurt's Groundcover for NoM - BC, AI, WG, GL.esp
Vurt's Groundcover for NoM - Solstheim [Lush version].esp
NoM_CoM Compatibility Patch.esp
NoM_IO Compatibility Patch.esp
NoM_LL Compatibility Patch.esp
NoM_MC Compatibility Patch.esp
NoM_MC Lists for Merging.esp

MGE XE

  - name: 'XE Sky Variations.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/41102/'
        name: 'MGE XE'

Better Bodies

  - name: 'Better Bodies.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/3880/'
        name: 'Better Bodies'

Better Heads

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/42226/'
        name: 'Better Heads'

Better Heads.esm
Better Heads Tribunal addon.esm
Better Heads Bloodmoon addon.esm

Morrowind Comes Alive

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/6006/'
        name: 'Morrowind Comes Alive'

MCA.esm
MCA - BB Peasant Gowns Addon.esp
MCA - BadKarma Clothing Vendor Addon.esp
MCA - COV Addon.esp
MCA - Hilgya the Seamstress Addon.esp
MCA - Illy's OMG Addon.esp
MCA - Khajiit Diversity Revamped Addon.esp
MCA - TR Addon.esp
MCA - Vampire Realism Patch.esp
MCA - Westly's FCOT Addon.esp

Morrowind Rebirth

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/37795/'
        name: 'Morrowind Rebirth'

Morrowind Rebirth [Main].ESP
Morrowind Rebirth - Birthsigns [Addon].ESP
Morrowind Rebirth - Game Settings [Addon].ESP
Morrowind Rebirth - Mercenaries [Addon].ESP
Morrowind Rebirth - Races [Addon].esp
Morrowind Rebirth - Skills [Addon].ESP
Morrowind Rebirth - Tools [Addon].ESP
Julan Ashlander Companion 2.02 [For Rebirth].esp
Julan Ashlander Companion 3.0 beta [For Rebirth].esp
Morrowind Patch Project v1.6.6 [For Rebirth].esm
Project Cyrodiil [For Rebirth].esm
Siege at Firemoth [For Rebirth].esp
Skyrim Home of the Nords [For Rebirth].esm
Uvirith's Legacy 3.53 [For Rebirth].esp

Better_Clothes Official Version

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/42262/'
        name: 'Better_Clothes Official Version'

Better Clothes_v1.1.esp
Better Clothes_v1.1_nac.esp

Better Bodies for NMM

  - name: 'Better Bodies.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/42399/'
        name: 'Better Bodies for NMM'

Real Signposts

  - name: 'RealSignposts.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/3879/'
        name: 'Real Signposts'

Run Faster - Faster Running Speed

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/42796/'
        name: 'Run Faster - Faster Running Speed'

RunFaster-LightSpeed.ESP
RunFaster-LightSpeedx2.ESP
RunFaster-SpeedofSound.ESP
RunFaster-Fast.ESP
RunFaster-Faster.ESP
RunFaster-FasterPlus.ESP
RunFaster-Fastest.ESP

Better Morrowind Armor

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/42509/'
        name: 'Better Morrowind Armor'

Complete Armor Joints.esp
Better Morrowind Armor DeFemm(a).ESP
Better Morrowind Armor DeFemm(o).ESP
Better Morrowind Armor DeFemm(r).ESP
Better Morrowind Armor.esp
LeFemmArmor.esp
Snow Prince Armor Redux.ESP

THE Facepack Compilation

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/14435/'
        name: 'THE Facepack Compilation'

THE Facepack Compilation.esp
THE Facepack Compilation.esm

Fair Magicka Regen v2B

  - name: 'Fair Magicka Regen 2.0b.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/39350/'
        name: 'Fair Magicka Regen v2B'

Robert's bodies

  - name: 'Robert''s Bodies.ESP'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/43138/'
        name: 'Robert''s bodies'

Graphic Herbalism

  - name: 'Graphic Herbalism.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/43140/'
        name: 'Graphic Herbalism'

Delayed DB Attack V2

  - name: 'DB_Attack_Mod.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/14891/'
        name: 'Delayed DB Attack V2'

Romance English Version

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/6932/'
        name: 'Romance English Version'

Bisexual_v10EV.esp
Prostitution_v12EV.esp
Romance_Follow_v10EV.esp
Romance_v37EV.esp

Better Robes

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/42773/'
        name: 'Better Robes'

Better Robes.ESP
Better Robes TR.esp
UFR_v3dot2.esp

Vurt's Groundcover

  - name: 'Vurt''s Corals.esp'
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/31051/'
        name: 'Vurt''s Groundcover'

Skyrim Home Of The Nords

  - name: ''
    url:
      - link: 'https://www.nexusmods.com/morrowind/mods/44921/'
        name: 'Skyrim Home Of The Nords'

Sky_Main.esp
Sky_Main.esm
Sky_Main_Grass.esp

In the list of plugins above there is one plugin that demonstrates that mlox's masterlist also contains old data. The mod Skyrim Home Of The Nords includes Sky_Main.esm in its main file. On the other hand, Sky_Main.esp is only to be found in an old version of the mod. mlox_base.txt, however, only knows the .esp version, not the newer .esm version.

@Ortham
Member Author
Ortham commented Feb 17, 2025

That being said, I think there is an elephant in the room, which I feel the need to address sooner rather than later. The issue I'm talking about is outdated and dead data.

My goal so far has basically been to make it as easy as possible to convert mlox rules to LOOT metadata, which neatly sidesteps any issues of input data quality. I agree that the data quality is a huge issue when it comes to actually maintaining the data though.

FWIW, the mlox_base.txt that you linked to and that I've been using hasn't been touched in 6 years, but Danae's rules files seem to have had a fair bit of attention since then, so are probably in better shape (and Danae's mlox_base.txt is smaller, so maybe some of the dead data has been stripped out). I haven't really looked into them though, so I can only really comment on the older rules.

I agree with the rule of only adding entries for which you have a URL. If we were to import the mlox metadata though, I don't think it would be practical to enforce that for the imported metadata: there's just too much of it (21236 plugin entries and counting). I think we'd need to make a distinction between adding new metadata, and porting existing metadata, where in the latter we'd basically be aiming for parity with mlox as the goal, with the understanding that we could then opportunistically improve on things after that.

On the subject of data quality, the messages in mlox have a variety of issues, relative to how we do things in LOOT:

  • Some messages provide no context, and must expect the user to be able to see the expressions that are associated with the message, which isn't great UX. E.g. Use only one of these plugins.
  • There's a lot of practically useless "Ref" references that give the filename of a readme but no indication of where to download it from.
  • Some messages put square brackets around part of the message, but AFAICT that's not interpreted as markup, they're just square brackets.
  • Some messages are also partially or completely wrapped in single or double quotes, apparently for no reason.
  • There is no special metadata type for dirty info, so messages about that are mixed in with others.
  • There are a lot of repeated messages, or messages that are worded or formatted slightly differently but not meaningfully so.
  • Some messages include URLs, but because there's no markup, they're just bare URLs while in LOOT messages we like to embed them as hyperlinks within text.

So there's a lot that could be done to improve on the source data, but a lot of that is due to the limitations of the mlox rules format. I wonder if it would make sense for the mlox maintainers/community to have the ability to derive mlox rules from the LOOT masterlist, so they could take advantage of our masterlist structure, and there could be a single source of truth that gets improvements? That's putting the cart a long way before the horse though.


One thing that might help with finding plugin URLs and also with matching file regexes to plugin names would be if we could download a dump of the file contents for all the downloads of all the Morrowind mods on Nexus Mods: a dump of all the downloads themselves would probably not be feasible (I don't know how large that would be, but I'm guessing 100s of GBs to a few TB), but each downloadable archive has a "Preview file contents" link that lists the files in the archive, and I'm guessing that's stored somewhere when you upload a file, and that list includes plugin names, so we could associate plugin names to Nexus URLs using that data.

That wouldn't help with any mods that aren't on Nexus Mods, but my guess is that it's got most of them. It might be worth us asking Picky or someone else at Nexus Mods if that would be possible. It would also be more efficient for them than serving us all those page views as we manually check page after page.

@Ortham
Member Author
Ortham commented Feb 17, 2025

I've gone back to checking for cycles because I only ran the check treating all plugin names as literals, which isn't right because some are patterns.

Unfortunately, there are cycles when also checking for pattern matches. I ran the checks in 3 stages:

  1. Comparing all plugin filename strings using case-insensitive equality. This produces no cycles.

  2. For all the strings that were actually patterns (i.e. contained ?, * or <VER>), I turned them into regexes and then found all the matching non-pattern strings. Then when adding edges to the graph, if the source or target strings were patterns, I added edges between them but also between all their matching strings. This produces cycles, and the minimised graph containing just the entries involved in those cycles is large enough to be a little awkward but is still readable:

    [Image: minimised graph of the entries involved in the stage 2 cycles]

  3. For all the patterns, I also compared them to find overlaps (e.g. foo*.esp and *bar.esp overlap because both match foobar.esp) and added edges between all overlapping patterns. The minimised graph is too complicated to really be readable, but here it is:

    [Image: minimised graph including the edges between overlapping patterns]
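Stage 2 can be sketched roughly as follows. This is a simplified illustration rather than the code I ran: the helper names are made up, `?` and `*` are assumed to behave like glob wildcards, and the regex used for `<VER>` is a guess at what mlox accepts.

```python
import re

# Assumption: <VER> stands for digits separated by ".", "_" or "-",
# optionally followed by a single letter.
VER_REGEX = r"\d+(?:[._-]?\d+)*[a-z]?"


def is_pattern(name):
    return any(token in name for token in ("?", "*", "<VER>"))


def pattern_to_regex(pattern):
    """Turn an mlox filename pattern into a case-insensitive regex."""
    parts = []
    for part in re.split(r"(<VER>|\*|\?)", pattern):
        if part == "<VER>":
            parts.append(VER_REGEX)
        elif part == "*":
            parts.append(".*")
        elif part == "?":
            parts.append(".")
        else:
            parts.append(re.escape(part))
    return re.compile("".join(parts), re.IGNORECASE)


def expand_edges(edges):
    """Add edges for every literal filename matched by a pattern endpoint."""
    names = {name for edge in edges for name in edge}
    literals = [name for name in names if not is_pattern(name)]

    def matches(name):
        if not is_pattern(name):
            return [name]
        regex = pattern_to_regex(name)
        return [name] + [lit for lit in literals if regex.fullmatch(lit)]

    return {
        (source, target)
        for before, after in edges
        for source in matches(before)
        for target in matches(after)
    }


# "foo*.esp" matches "foobaz.esp", so expanding the first rule adds an edge
# foobaz.esp -> bar.esp, which forms a cycle with the second rule.
rules = [("foo*.esp", "bar.esp"), ("bar.esp", "foobaz.esp")]
print(sorted(expand_edges(rules)))
```

Stage 3 additionally needs a way to decide whether two patterns can match the same string, which this sketch does not attempt.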

So, what does this mean?

  • It's not a problem with the current state of the import script, because LOOT doesn't support having regex in its after metadata, so wouldn't match any of the patterns anyway.
  • I bet a lot of these cycles don't actually exist, and are due to mlox providing only limited ability to pattern-match on plugin names: with only ?, * and <VER>, there's a vast gap in capabilities between ? and *, so you end up with patterns that are far more generic than they could be if you could write them in regex.
  • More broadly, I think this demonstrates that not supporting similar patterns in after metadata is the correct call, because it means we mostly avoid this issue (we can still cause cycles through bad metadata, but only between specific plugins so it's easy to find and fix). Supporting regex would mean that you can be more restrained than sprinkling .* everywhere, but even so I think there's a general tendency to write regexes that match more than you need them to.

@Ortham
Member Author
Ortham commented Feb 17, 2025

I had a quick look at the URLs in the mlox rules, using grep to find and filter them:

grep -Po 'https?://(?!www.nexusmods.com|morrowind.nexusmods.com|gmml.pbwiki.com|www.mwmythicmods.com|planetelderscrolls.gamespy.com|www.bethsoft.com|forums.bethsoft.com|mw.modhistory.com|download.fliggerty.com|www.uesp.net|www.youtube.com|wryemusings.com|lovkullen.net|forums.nexusmods.com|www.theassimilationlab.com|wiki.theassimilationlab.com|www.fliggerty.com|www.msu.edu|btb2.free.fr|www.angelfire.com|www.lgnpc.org|yacoby.silgrad.com|tamriel-rebuilt.org|escf.rethan-manor.net|www.automatichamster.com|abitoftaste.altervista.org|www.pirates.retreat.btinternet.co.uk|www.calislahn.com|wrye.ufrealms.net|www.freewebs.com|webpages.charter.net|morrgraphext.wiki.sourceforge.net|www.elricm.com|www.ladymoiraine.com|www.mw.yacoby.net|www.doupe.cz|wolflore.net|www.ornitocopter.net|fallingawkwardly.com|www.sheikizza.boneflower.com)\S+' mlox_base.txt | sort > urls.txt

That leaves 32 unfiltered URLs that I haven't checked the domains of.
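If it's useful later, a per-domain tally of the URLs can also be produced with a few lines of Python. This is a sketch under the assumption that the rules file has been read into a string; `count_hosts` is a name made up for this example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Same idea as the grep above: a URL is "http(s)://" plus non-space characters.
URL_RE = re.compile(r"https?://\S+")


def count_hosts(text):
    """Count how many URLs in text point at each hostname."""
    return Counter(urlsplit(url).hostname for url in URL_RE.findall(text))


sample = (
    "see http://www.uesp.net/wiki/Tes3Mod:Mods and "
    "https://www.nexusmods.com/morrowind/mods/44537/ "
    "http://www.nexusmods.com/morrowind/mods/46364/"
)
for host, count in count_hosts(sample).most_common():
    print(count, host)
```

URLs that are immediately followed by punctuation in the rules text may need trimming before they are parsed.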

Of the URLs that I filtered out:

gmml.pbwiki.com
www.mwmythicmods.com
planetelderscrolls.gamespy.com
www.bethsoft.com
forums.bethsoft.com
www.theassimilationlab.com
wiki.theassimilationlab.com
www.msu.edu
btb2.free.fr
yacoby.silgrad.com
wrye.ufrealms.net
www.freewebs.com
www.elricm.com
www.ladymoiraine.com
www.mw.yacoby.net
www.ornitocopter.net
www.doupe.cz
wolflore.net

are domains that have since lapsed, been retired, or otherwise no longer serve their original purpose.

mw.modhistory.com
download.fliggerty.com
www.fliggerty.com
www.pirates.retreat.btinternet.co.uk
webpages.charter.net

are domains that I got timeouts on (I think I vaguely recall Great House Fliggerty shutting down).

Also, escf.rethan-manor.net gave me a 500 Internal Server Error, and I got a "no cipher overlap" error when trying to connect to morrgraphext.wiki.sourceforge.net.

Finally,

www.nexusmods.com
morrowind.nexusmods.com
www.uesp.net
www.youtube.com
wryemusings.com
lovkullen.net
forums.nexusmods.com
www.angelfire.com
www.lgnpc.org
tamriel-rebuilt.org
www.automatichamster.com
abitoftaste.altervista.org
www.calislahn.com
fallingawkwardly.com
www.sheikizza.boneflower.com

are domains that still point to modding sites (though individual pages may have moved or been deleted).

I'm not doing anything with this info right now, just wanted to record it so I could come back to it in the future.

@Ortham
Member Author
Ortham commented Feb 17, 2025

@Pickysaurus

One thing that might help with finding plugin URLs and also with matching file regexes to plugin names would be if we could download a dump of the file contents for all the downloads of all the Morrowind mods on Nexus Mods: a dump of all the downloads themselves would probably not be feasible (I don't know how large that would be, but I'm guessing 100s of GBs to a few TB), but each downloadable archive has a "Preview file contents" link that lists the files in the archive, and I'm guessing that's stored somewhere when you upload a file, and that list includes plugin names, so we could associate plugin names to Nexus URLs using that data.

That wouldn't help with any mods that aren't on Nexus Mods, but my guess is that it's got most of them. It might be worth us asking Picky or someone else at Nexus Mods if that would be possible. It would also be more efficient for them than serving us all those page views as we manually check page after page.

I've submitted a request to our team. They're quite busy at the moment so this could take a while.

@Ortham
Member Author
Ortham commented Feb 18, 2025

The import script is now able to convert everything in mlox_base.txt and Danae's mlox_base.txt and mlox_user.txt, though one invalid rule first needs to be manually fixed in the first file (on line 30041), and there's a note rule with no message in the second file (on line 7515) that can't be converted.

The script doesn't handle all the possible expression combinations, but it should log a warning if it encounters a rule that it can't fully convert (along with other warnings for any filenames that use special characters in places where LOOT doesn't support them).

@Ortham
Member Author
Ortham commented Feb 21, 2025

I've tidied up and published the code I used to check for cycles in mlox's rules above as mlox-cycle-finder, and while doing that I realised that I could reduce the complexity of the output graphs some more, so have updated the comment above with new images that are both now small enough for GitHub to render.

@Ortham
Member Author
Ortham commented Feb 22, 2025

I've asked in Discord about the possibility of getting a license added to Danae's mlox rules repo.

@Ortham
Member Author
Ortham commented May 2, 2025

With the release of LOOT v0.26.0, the metadata syntax extensions I've made to support equivalents to mlox's expressions are now released. While there are still other things that mlox supports and LOOT doesn't, I think what's left is stuff that I don't intend to change:

  • regex in filenames for load after, requirement and incompatibility entries
  • "low priority" (i.e. between notes and warnings) messages
  • spoiler text
  • skipping rules that would cause cycles
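
For reference, "skipping rules that would cause cycles" could look something like the sketch below. It is only an illustration of the idea, not mlox's actual code: the function names are made up, and I'm assuming rules are applied one at a time in file order, with a rule dropped if the graph already contains a path in the opposite direction.

```python
from collections import defaultdict


def has_path(graph, start, goal):
    """Return True if goal is reachable from start (iterative depth-first search)."""
    stack = [start]
    seen = set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return False


def add_rules_skipping_cycles(rules):
    """Build a 'loads before' graph, skipping any rule that would close a cycle.

    rules is a list of (earlier, later) plugin name pairs. Returns the graph
    and the list of rules that were skipped.
    """
    graph = defaultdict(set)
    skipped = []
    for earlier, later in rules:
        # Adding earlier -> later closes a cycle if later already reaches earlier.
        if has_path(graph, later, earlier):
            skipped.append((earlier, later))
        else:
            graph[earlier].add(later)
    return graph, skipped


# Made-up plugin names: the third rule contradicts the first two.
rules = [("A.esp", "B.esp"), ("B.esp", "C.esp"), ("C.esp", "A.esp")]
graph, skipped = add_rules_skipping_cycles(rules)
print(skipped)
```

The downside of this approach is that which rule gets dropped depends on the order the rules are applied in, so bad metadata is hidden instead of surfaced.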

I've got some other stuff that's higher priority, but intend to come back to this soon.
