Wednesday, December 15, 2021

Quality check

Recently I ran a CueTools scan on an entire previous version of TLMC, as I had been planning to do ever since I discovered I had this option. It was not exactly v.19, as that had already been slightly dismantled (~25 albums replaced with better versions, renames applied), but something pretty close, say 99+%. The results were quite unexpected for me. If you're curious you can download the full check logs (the checker might've also grabbed some extra files from bonuses) and a summary in both the original text format and a table. The visual html+css table uses some heuristics to transform the summary text into a color-coded quality representation, so it's not entirely precise. If in doubt, refer to the detailed logs. You could also run the check yourself, although be prepared to let the program crunch numbers single-threadedly for a day or so: the bottleneck is the audio decode time, which runs at ~200x realtime on my CPU for the TTA codec. My logs differ slightly from the ones produced by the original CueTools, because I patched it a little for my convenience, but the only change is that the calculated offset is displayed in CTDB checks, which the unmodified version does not show.

The summary of the results is as follows: about 20% of all disc images fail verification. Here by "fail" I don't mean "disc was not found", I mean "was found to be different and unrepairable". Oh, and completely by coincidence the peak level of the supermajority of such rips is almost exactly 98%. And by another coincidence normalization to 98% is the default value of an admittedly turned-off-by-default setting in "Exact"AudioCopy. In addition to those there is another 15% of disc images which also have a peak level of 98% and return "disc not found"; it'd be fair to assume they would also fail to verify.
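The 98% peak is something anyone can check on their own rips. Here is a minimal sketch, assuming 16-bit signed PCM samples that have already been decoded into memory. The function names, the tolerance and the choice of 32768 as full scale are illustrative assumptions, not what CueTools or EAC do internally.

```python
import array

FULL_SCALE_16BIT = 32768  # assumed full-scale reference for 16-bit audio


def peak_level(samples):
    """Return the largest absolute sample value as a fraction of full scale."""
    return max((abs(s) for s in samples), default=0) / FULL_SCALE_16BIT


def looks_normalized(samples, target=0.98, tolerance=0.001):
    """True if the peak lies within `tolerance` of `target`.

    Both defaults are illustrative; a hit is only a hint, not proof,
    that the rip was normalized.
    """
    return abs(peak_level(samples) - target) <= tolerance


# Synthetic data: one image peaking at ~98% of full scale, one at 100%.
normalized = array.array("h", [0, 1200, -32113, 15000])
untouched = array.array("h", [0, 1200, -32768, 15000])

print(looks_normalized(normalized), looks_normalized(untouched))
```

A peak near 98% on a single disc proves nothing by itself. It is the same value repeating across a large share of the failing rips that makes the pattern suspicious.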

Lessons to be had here:
1. Not only is it your duty as a software developer to provide sensible defaults, doing only that and no more is clearly insufficient to produce reasonable outcomes. Out of 10 monkeys that see an unknown lever at least 3 will pull it and leave it there. Any option that does anything unpredictable to the dumbest possible user should be hidden behind a curtain that requires a certain intelligence level to bypass, one that is enough to understand what the hidden thing actually does. In our particular case it could have been a domain-specific scripting language interpreter window for general purpose audio transforms, not a "please ruin my rips" on-off switch in plain sight. Of course there is another danger: by simply removing anything the unfortunate could use to shoot their feet off, instead of properly hiding it, because removal is easier, you risk turning your software into iCrap.
2. We don't know how badly mangled these rips are.
Maybe they were improperly done and there are stutters or clicks or anything else that wasn't in the original CDs.
Maybe the only problem is the normalization, so it is displeasing on an intellectual level, but would not be audible as a defect.
3. In my defense those album versions were the only ones available, so the only alternative to including these rips was not to include them. I could not have done anything differently, had I known about it, but I wasn't aware of the problem and its extent. Now I am and so are you.

21 comments:

  1. I guess I need to rip and upload a few more CDs.

    Replies
    1. https://nyaa.si/view/1468198
      https://nyaa.si/view/1468199
      https://nyaa.si/view/1468200
      https://nyaa.si/view/1468201
      https://nyaa.si/view/1468202
      https://nyaa.si/view/1468206
      https://nyaa.si/view/1468209
      https://nyaa.si/view/1468210
      https://nyaa.si/view/1468211
      https://nyaa.si/view/1468213
      https://nyaa.si/view/1468214
      https://nyaa.si/view/1468215
      https://nyaa.si/view/1468216

  2. I should put up a detailed accurip result interpretation post some time later, but for now the most important thing I should note is that, in the absence of other information, "AR: rip not accurate" does not mean much, because it needs a content + offset match to say "accurate", and the correct content + wrong offset situation is mostly fine and trivial to fix.
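To illustrate the "correct content + wrong offset" case, here is a toy sketch that compares raw sample lists directly. Real AccurateRip and CTDB checks work on checksums rather than raw comparison, so the function name, the search range and the sign convention below are purely illustrative assumptions.

```python
def find_offset(reference, rip, max_offset=3000):
    """Return k such that rip[i + k] == reference[i] on the overlap, else None.

    Assumes both sequences have the same length. Offsets are tried in order
    of increasing magnitude, so the smallest matching shift is returned.
    """
    if len(reference) != len(rip):
        raise ValueError("this sketch assumes equal-length sequences")
    n = len(reference)
    limit = min(max_offset, n - 1)
    for magnitude in range(limit + 1):
        for k in ((0,) if magnitude == 0 else (magnitude, -magnitude)):
            if k >= 0:
                a, b = reference[: n - k], rip[k:]
            else:
                a, b = reference[-k:], rip[: n + k]
            if a and a == b:
                return k
    return None


# Same audio content, but the rip is delayed by two samples.
reference = [0, 0, 5, -3, 7, 2, 0, 0]
rip = [0, 0, 0, 0, 5, -3, 7, 2]

print(find_offset(reference, rip, max_offset=3))
```

When a shift like this is found, the content is identical and only the alignment differs, which is the situation described above as mostly fine.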

  3. Will this delay the v.20 release?

    Replies
    1. Nope. The files will just stay, which requires no effort from me, until someone shares a better version.

  4. Hey rwx, I don't know where else to contact you, but I have some album rips that aren't already in TLMC. I also have various scans of albums that are currently in TLMC, as I own them physically. Can you tell me where I can contact you further?

    Replies
    1. You're already contacting me, I don't know how you can do it further. If you want a private contact method you can write to xmpp/jabber address rwx@headcounter.org, but just to donate rips you don't even need that - upload them somewhere and share a link in comments, preferably in the "Pre-release post" thread. Additional upside of that is that people can grab it themselves without waiting for the torrent, although in this case it's less important, as I think I'm wrapping up and plan to share the next version in several weeks [*]. I don't have a dedicated upload server, as most of the time I was operating in pull mode, if that's what you're interested in.

      [*] Nearly every single one of my previous estimates was off in a certain direction, take note.

  5. Okay awesome. Thanks, and I'll get the rips and images to you as soon as possible.

  6. [DiGiTAL WiNG]\2013.05.26 [DWCD-0008] BEST OF WiNG [例大祭10] is outright broken; at about 3:02 into the fifth track, it suddenly cuts into a different song and the final two tracks in the .cue don't play, as if someone had cut out eight minutes or so of the album right in the middle. The AccuRip indicates the same; the first four tracks are reported as accurately ripped and the rest of them fail entirely.

    Replies
    1. I've found a non-broken version, will replace that file.

  7. Hi,

    Recently it was revealed to me in a dream that the world will be ending in approximately two weeks, do you think the next version of TLMC will be released by then? No pressure, just curious.

    Also, is there an estimate on how big the finished torrent will be? I want to make sure there is enough room on my NAS for it.

    Thank you for all your hard work.

    Replies
    1. >in approximately two weeks, do you think the next version of TLMC will be released by then? No pressure, just curious.
      I have three distinct answers:
      - Desired release date, which says "earlier than that".
      - Prediction based on internal feelings/intuition, which says "close, but probably yes".
      - Prediction based on past performance, which says "no way".

      >Also, is there an estimate on how big the finished torrent will be? I want to make sure there is enough room on my NAS for it.
      About 1 extra TB, so about 3 in total. I think I'll skip all mixed sampling-rate/bit-depth albums for now, because I need to decide what are the best tools and how to use them (which resampler, yes/no to dither, yes/no to normalization in 24-to-16, yes/no to clipping detection/mitigation, etc) and it would delay the release even more, even though all of the differences will fairly likely be entirely under detection threshold.

  8. Yo,

    I was listening to some Aether&埼玉最終兵器 the other day and I noticed that some of their songs stutter.
    In "2016.08.13 [SHAT-0010] TH MEDLEY -夢幻天奏 弐- [C90]", there are actually two songs (maybe more? I'll continue listening) which stutter:
    "幽霊楽壇~Phantom Ensemble" near 1:18
    "幻視の夜~Ghostly Eyes" near 0:43

    Just want to know if it's just me or not. Currently doing a recheck on the files.
    I could search for another version of the album if you want. I like this circle so I don't mind buying their albums if needed.

    Sorry to inconvenience.
    Have a good day!

    Replies
    1. Not just you. Rather than stutter I'd describe that as dropout, but yes, there is an unnatural silence in those locations.
      As you can see that particular rip got perfect verification, so it's either a really weird stylistic choice (highly unlikely) or most likely a mastering defect.
      By all means buy their album if you like the circle and wish to support them, but it will not get you any better sound, unless they issued a fixed reprint version.

    2. Thanks for your reply.

      I definitely plan to buy what I like to listen to. That's one of the goals of TLMC for me, to find new things to listen to.

      As for the dropouts in the songs, glad to know it's not a rip issue. I think I'll fix it in Audacity or something.

  9. Here are a couple of fixed albums:
    https://nyaa.si/view/1485792
    https://nyaa.si/view/1485793

  10. I'm not sure if you answered this already, but why are you using TTA instead of FLAC? Is there a technical reason (slightly higher compression) or is it just more convenient (faster) for you? Most people would be happier if they were FLAC.

    Replies
    1. from v19 release faq

      Q: Why TTA and not FLAC?
      A: Two reasons, one historical and one technical.
      Historical one is that a huge chunk of early albums were in tta, so to be consistent about the codec I converted the rest (mostly ape) into tta and then the majority of albums were also shared as tta, so less work for me. The technical one is that I like tta better. Flac is open source and rather popular, it's the most popular lossless format on a lot of metrics, which is a big reason to adopt it. Tta is also opensource and less popular, but not completely unknown, and several of its design decisions (whether intentional or accidental) stand out to me as flat out better. These are: only one compression level and no embedded metadata support*. According to my preferences this outweighs flac's popularity.
      * Mixing immutable data with metadata is a big no-no in my eyes and leads to disappointment and sad kittens. You wouldn't make a kitten sad, would you?

    2. >several of its design decisions (whether intentional or accidental) stand out to me as flat out better
      While there's room for disagreement on some technical decisions, the given reasons aren't very sound.

      >only one compression level
      As someone distributing once encoded read-only files, why does this matter on a technical level? The default FLAC compression setting of 5 is default for a reason. It provides the best balance of compression to encoding time for the majority of users. If needed it can be adjusted to specific use cases.

      >no embedded metadata support
      This is incorrect. The TTA website explicitly states on their homepage since at least 2015 it supports ID3v1 and ID3v2.

      >Mixing immutable data with metadata is a big no-no in my eyes
      Trying to prevent users from adding their own embedded metadata to files and breaking the torrent is understandable, but combined with whole-album files it is too inconvenient for most users. As seen with the recent questionable FLAC re-encode (https://nyaa.si/view/1359948), not having the metadata embedded will cause fragmentation in the future. While I actually agree with you that ideally metadata should be stored separately from the files, I think FLAC handles metadata pretty well in the real world. By default flac (and probably libflac, but I haven't actually checked) reserves 8kb of metadata space. This is great since changes don't overwrite the whole file. This is particularly great for torrents since only the chunk where the metadata is located needs to be redownloaded instead of the entire file. I think splitting songs into individual files with embedded metadata along with a multi-file cue file per album would be the ideal release format.

      On another note, a hybrid v2 torrent would be greatly appreciated when TLMC v20 is released.

    3. I had to enable premoderation because of sudden recent increase in spam, please don't worry if your comments don't show up immediately.

    4. To Anonymous @ 26 February, I do have a reply to that, I'll post it later.
