• 4 Posts
  • 37 Comments
Joined 1 year ago
Cake day: June 10th, 2023



  • I believe the optimization came because the denominator was a power of two. As I remember it, the function counted up all of the bytes being sent and checked that the sum was a multiple of 16 (I think 16 bytes made a single USB endpoint or something; I still don’t fully understand USB).

    For starters, you can split up a larger modulo into smaller ones:

    X = (A + B); X % n = (A % n + B % n) % n

    So our 16 bit number X can be split into an upper and lower byte:

    X = (X & 0xFF) + ((X >> 8) << 8)

    so

    X % 16 = ((X & 0xFF) % 16 + ((X >> 8) << 8) % 16) % 16

    This is probably what the compiler was doing in the background anyway, but the real magic came from this neat trick:

    x % 2^n = x & (2^n - 1).

    so

    x % 16 = x & 15

    So a 16 bit modulo just became three bitwise ANDs.

    Edit: and before anybody thinks I’m good at math, I’m pretty sure I found a forum post where someone was solving exactly my problem, and I just copy/pasted it in.

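    Edit2: I’m pretty sure I left it here, but I think you can further optimize by just ignoring the upper byte entirely, since the upper byte only ever contributes a multiple of 256, and 256 is itself a multiple of 16. Again, this only works because 16 is a power of 2 and plays nicely with bitwise arithmetic.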
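
    For anyone who wants to convince themselves, here is a minimal sketch in C of the three forms discussed above. This is not the original firmware code; the function names (`mod16_plain`, `mod16_split`, `mod16_mask`, `check_all_u16`) are made up for illustration, and it assumes the byte count is an unsigned 16-bit value, which is what makes the mask trick safe.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Plain modulo. On a small target without a hardware divider, the
       compiler may or may not turn this into a mask on its own. */
    static uint8_t mod16_plain(uint16_t x)
    {
        return (uint8_t)(x % 16u);
    }

    /* Byte-split form: x = low + high * 256, then
       (A + B) % n == (A % n + B % n) % n. */
    static uint8_t mod16_split(uint16_t x)
    {
        uint16_t low  = (uint16_t)(x & 0xFFu);
        uint16_t high = (uint16_t)((x >> 8) << 8); /* upper byte, back in place */
        return (uint8_t)(((low % 16u) + (high % 16u)) % 16u);
    }

    /* Mask form: for unsigned x, x % 2^n == x & (2^n - 1).
       The upper byte is ignored entirely because 256 % 16 == 0. */
    static uint8_t mod16_mask(uint16_t x)
    {
        return (uint8_t)(x & 15u);
    }

    /* Exhaustively compare all three forms over every 16-bit value.
       Returns how many values were checked. */
    static uint32_t check_all_u16(void)
    {
        uint32_t checked = 0;
        for (uint32_t i = 0; i <= 0xFFFFu; i++) {
            uint16_t x = (uint16_t)i;
            assert(mod16_split(x) == mod16_plain(x));
            assert(mod16_mask(x) == mod16_plain(x));
            checked++;
        }
        return checked;
    }
    ```

    Calling `check_all_u16()` walks all 65536 inputs, so if any of the identities were wrong an assert would fire. Note the trick only holds for unsigned (or non-negative) values; with negative signed integers, `%` and `&` disagree in C.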














  • ch00f@lemmy.world (OP) to Selfhosted@lemmy.world: [Question] Rate my upgrade!
    4 months ago

    Yeah, but we always run them in native formats, so it’s not a big load on the processor. We only watch the 4K stuff at home where it’s got a hardwired gigabit ethernet connection.

    If you saw my other comment, I’m kind of talking myself out of this upgrade since I managed to get qsv working on my current rig.


  • That shouldn’t be the case. I’d look into getting this fixed properly before spending a ton of money on new hardware that you may not actually need. It smells to me like the encode or decode part isn’t actually being done in hardware here.

    Right you are!

    Dug into it a little more. There were some ffmpeg flags that weren’t being enabled by the latest release of Photoprism. Had to move to the test build. https://github.com/photoprism/photoprism/discussions/4093

    While it’s faster than real time now, Photoprism still won’t start streaming until the preview is fully generated, so longer video clips can take a minute or two to start playing. It only has to happen once per file, but it’s still annoying. There’s a feature to pre-transcode video, but it only gets the file into a streamable format. It doesn’t check bitrate/size until you actually start to play.

    I might write a script to pre-generate the preview files, but either way, I don’t think I need to upgrade the server quite yet.