• dzaima
    3 months ago

    There appears to be dedicated silicon for e.g. ADD AH, BL: uops.info shows it taking 1 uop across multiple microarchitectures. The notation 1*p0156 means one uop that can issue on any of ports 0/1/5/6, i.e. a theoretical throughput of 4 instrs/cycle (0.25 cycles per instruction). I think the displayed 0.4 is just an artifact of the test using only 3 different destination registers, even though each instruction depends on the previous value of its destination. The newer Alder Lake actually has lower throughput, but still takes only one uop.
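
To make the arithmetic concrete, here is a rough sketch of the two limits involved: the port limit (uops divided by usable ports) and the dependency limit (latency divided by the number of independent destination registers). The function name and the 1-cycle latency are assumptions for illustration, not values taken from uops.info, and this simple model is not expected to reproduce the measured 0.4 exactly.

```python
from fractions import Fraction

def reciprocal_throughput_bound(num_ports, uops_per_instr, chains, latency):
    """Lower bound on cycles per instruction under a simple two-limit model.

    num_ports      -- ports that can execute the uop (4 for p0156)
    uops_per_instr -- uops per instruction (1 here)
    chains         -- independent dependency chains (distinct destination registers)
    latency        -- cycles before a result can feed the next instruction
                      in the same chain (ASSUMED to be 1 here)
    """
    port_bound = Fraction(uops_per_instr, num_ports)  # limit from execution ports
    dep_bound = Fraction(latency, chains)             # limit from dependency chains
    return max(port_bound, dep_bound)

# Three destination registers: the dependency chains, not the ports, set the floor.
print(reciprocal_throughput_bound(4, 1, 3, 1))  # 1/3
# Enough independent chains: the 4 ports become the limit.
print(reciprocal_throughput_bound(4, 1, 8, 1))  # 1/4
```

With only 3 chains the bound is already above the 0.25 that the ports alone would allow, which is consistent with the displayed number being a test artifact rather than a port limit.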