Tags

  • x86_cleanups_for_6.5

     - Address -Wmissing-prototypes warnings
     - Remove repeated 'the' in comments
     - Remove unused current_untag_mask()
     - Document urgent tip branch timing
     - Clean up MSR kernel-doc notation
     - Clean up paravirt_ops doc
     - Update Srivatsa S. Bhat's maintained areas
     - Remove unused extern declaration acpi_copy_wakeup_routine()
    
  • x86_tdx_for_6.5

     - Fix a race window where load_unaligned_zeropad() could cause
       a fatal shutdown during TDX private<=>shared conversion
     - Annotate sites where VM "exit reasons" are reused as hypercall
       numbers.
    
  • x86_platform_for_6.5

    Add UV platform support for sub-NUMA clustering
    
  • x86_irq_for_6.5

    Add Hyper-V interrupts to /proc/stat
    
  • x86_cpu_for_v6.5

    - Compute the purposeful misalignment of zen_untrain_ret automatically
      and assert __x86_return_thunk's alignment so that future changes to
      the symbol macros do not accidentally break them.
    
    - Remove CONFIG_X86_FEATURE_NAMES Kconfig option as its existence is
      pointless
    
  • x86_cc_for_v6.5

    - Add support for unaccepted memory as specified in the UEFI spec v2.9.
      The gist of it is that Intel TDX and AMD SEV-SNP confidential
      computing guests define the notion of accepting memory before using
      it, which prevents a whole class of attacks against such guests,
      like memory replay.
    
      There are a couple of strategies for how memory can be accepted;
      the current implementation accepts memory on demand.
    
  • x86_cache_for_v6.5

    - Implement a rename operation in resctrlfs to facilitate handling
      of application containers with dynamically changing task lists
    
    - When reading the tasks file, show only the PIDs of tasks in the
      current namespace, instead of also showing the PIDs from the init
      namespace
    
    - Other fixes and improvements
    
  • x86_build_for_v6.5

    - Remove relocation information from vmlinux as it is not needed by
      other tooling, which results in a slimmer binary. This is important
      for distros which have to distribute vmlinux blobs with their
      kernel packages, as that extraneous data bloats them for no good
      reason
    
  • x86_alternatives_for_v6.5

    - Up until now the Fast Short Rep Mov optimizations implied the presence
      of the ERMS CPUID flag. AMD decoupled them with a BIOS setting so decouple
      that dependency in the kernel code too
    
    - Teach the alternatives machinery to handle relocations
    
    - Make debug_alternative accept flags in order to see only the set of
      patching one is interested in
    
    - Other fixes, cleanups and optimizations to the patching code
    
  • ras_core_for_v6.5

    - Add initial support for RAS hardware found on AMD server GPUs (MI200).
      Those GPUs and CPUs are connected together through the coherent fabric
      and the GPU memory controllers report errors through x86's MCA so EDAC
      needs to support them. The amd64_edac driver now supports HBM (High
      Bandwidth Memory) and thus such heterogeneous memory controller
      systems
    
    - Other small cleanups and improvements
    
  • x86-core-2023-06-26

    A set of fixes for kexec(), reboot and shutdown issues
    
     - Ensure that the WBINVD in stop_this_cpu() has been completed before the
       control CPU proceeds.
    
       stop_this_cpu() is used for kexec(), reboot and shutdown to park the APs
       in a HLT loop.
    
       The control CPU sends an IPI to the APs and waits for their CPU online bits
       to be cleared. Once they all are marked "offline" it proceeds.
    
       But stop_this_cpu() clears the CPU online bit before issuing WBINVD,
       which means there is no guarantee that the AP has reached the HLT loop.
    
       This was reported to cause intermittent reboot/shutdown failures due to
       some dubious interaction with the firmware.
    
       This is not only a problem of WBINVD. The code to actually "stop" the
       CPU which runs between clearing the online bit and reaching the HLT loop
       can cause large enough delays on its own (think virtualization). That's
       especially dangerous for kexec() as kexec() expects that all APs are in
       a safe state and not executing code while the boot CPU jumps to the new
       kernel. There are more issues vs. kexec() which are addressed separately.
    
       Cure this by implementing an explicit synchronization point right before
       the AP reaches HLT. This guarantees that the AP has completed the full
       stop procedure.
    
     - Fix the condition for WBINVD in stop_this_cpu().
    
       The WBINVD in stop_this_cpu() is required for ensuring that when
       switching to or from memory encryption no dirty data is left in the
       cache lines which might cause a write back in the wrong mode later.
    
       This checks CPUID directly because the feature bit might have been
       cleared due to a command line option.
    
       But that CPUID check accesses leaf 0x8000001f::EAX unconditionally. Intel
       CPUs return the content of the highest supported leaf when a non-existing
       leaf is read, while AMD CPUs return all zeros for unsupported leaves.

       So on Intel CPUs the result of the test is a lottery, while on AMD it is
       correct, but only by chance.
    
       While harmless it's incorrect and causes the conditional wbinvd() to be
       issued where not required, which caused the above issue to be unearthed.
    
     - Make kexec() robust against AP code execution
    
       Ashok observed triple faults when doing kexec() on a system which had
       been booted with "nosmt".
    
       It turned out that the SMT siblings which had been brought up partially
       are parked in mwait_play_dead() to enable power savings.
    
       mwait_play_dead() is monitoring the thread flags of the AP's idle task,
       which has been chosen as it's unlikely to be written to.
    
       But kexec() can overwrite the previous kernel text and data including
       page tables etc. When it overwrites the cache lines monitored by an AP
       that AP resumes execution after the MWAIT on eventually overwritten
       text, stack and page tables, which obviously might end up in a triple
       fault easily.
    
       Make this more robust in several steps:
    
        1) Use an explicit per CPU cache line for monitoring.
    
        2) Write a command to these cache lines to kick APs out of MWAIT before
           proceeding with kexec(), shutdown or reboot.
    
           The APs confirm the wakeup by writing status back and then enter a
           HLT loop.
    
        3) If the system uses INIT/INIT/STARTUP for AP bringup, park the APs
           in INIT state.
    
           HLT is not a guarantee that an AP won't wake up and resume
           execution. HLT is woken up by NMI and SMI. SMI puts the CPU back
           into HLT (+/- firmware bugs), but NMI is delivered to the CPU which
           executes the NMI handler. Same issue as the MWAIT scenario described
           above.
    
           Sending an INIT/INIT sequence to the APs puts them into wait for
           STARTUP state, which is safe against NMI.
    
        There is still an issue remaining which can't be fixed: #MCE
    
        If the AP sits in HLT and receives a broadcast #MCE it will try to
        handle it with the obvious consequences.
    
        INIT/INIT clears CR4.MCE in the AP which will cause a broadcast #MCE to
        shut down the machine.
    
        So there is a choice between fire (HLT) and frying pan (INIT). Frying
        pan has been chosen as it's at least preventing the NMI issue.
    
        On systems which are not using INIT/INIT/STARTUP there is not much
        which can be done right now, but at least the obvious and easy to
        trigger MWAIT issue has been addressed.
    
  • x86-boot-2023-06-26

    Updates for the x86 boot process:
    
     - Initialize FPU late.
    
       Right now FPU is initialized very early during boot. There is no real
       requirement to do so. The only requirement is to have it done before
       alternatives are patched.
    
       That's done in check_bugs() which does way more than what the function
       name suggests.
    
       So first rename check_bugs() to arch_cpu_finalize_init() which makes it
       clear what this is about.
    
       Move the invocation of arch_cpu_finalize_init() earlier in
       start_kernel() as it has to be done before fork_init() which needs to
       know the FPU register buffer size.
    
       With those prerequisites the FPU initialization can be moved into
       arch_cpu_finalize_init(), which removes it from the early and fragile
       part of the x86 bringup.
    
  • timers-core-2023-06-26

    Time, timekeeping and related device driver updates:
    
     - Core:
    
       - A set of fixes, cleanups and enhancements to the posix timer code:
    
         - Prevent another possible live lock scenario in the exit() path,
           which affects POSIX_CPU_TIMERS_TASK_WORK enabled architectures.
    
          - Fix a loop termination issue which was reported by
            syzkaller/KASAN in the posix timer ID allocation code.
    
           That triggered a deeper look into the posix-timer code which
           unearthed more small issues.
    
         - Add missing READ/WRITE_ONCE() annotations
    
         - Fix or remove completely outdated comments
    
         - Document places which are subtle and completely undocumented.
    
       - Add missing hrtimer modes to the trace event decoder
    
       - Small cleanups and enhancements all over the place
    
     - Drivers:
    
         - Rework the Hyper-V clocksource and sched clock setup code
    
         - Remove a deprecated clocksource driver
    
         - Small fixes and enhancements all over the place
    
  • smp-core-2023-06-26

    A large update for SMP management:
    
      - Parallel CPU bringup
    
        The reason why people are interested in parallel bringup is to shorten
        the (kexec) reboot time of cloud servers to reduce the downtime of the
        VM tenants.
    
        The current fully serialized bringup does the following per AP:
    
          1) Prepare callbacks (allocate, initialize, create threads)
          2) Kick the AP alive (e.g. INIT/SIPI on x86)
          3) Wait for the AP to report alive state
          4) Let the AP continue through the atomic bringup
          5) Let the AP run the threaded bringup to full online state
    
        There are two significant delays:
    
          #3 The time for an AP to report alive state in start_secondary() on
             x86 has been measured in the range between 350us and 3.5ms
             depending on vendor and CPU type, BIOS microcode size etc.
    
          #4 The atomic bringup does the microcode update. This has been
             measured to take up to ~8ms on the primary threads depending on
             the microcode patch size to apply.
    
        On a two socket SKL server with 56 cores (112 threads) the boot CPU
        spends on current mainline about 800ms busy waiting for the APs to come
        up and apply microcode. That's more than 80% of the actual onlining
        procedure.
    
        This can be reduced significantly by splitting the bringup mechanism
        into two parts:
    
          1) Run the prepare callbacks and kick the AP alive for each AP
             which needs to be brought up.

             The APs wake up, do their firmware initialization and run the
             low level kernel startup code including microcode loading in
             parallel up to the first synchronization point. (#1 and #2
             above)

          2) Run the rest of the bringup code strictly serialized per CPU
             (#3 - #5 above) as it's done today.

             Parallelizing that stage of the CPU bringup might be possible
             in theory, but it's questionable whether the required surgery
             would be justified for a pretty small gain.
    
        If the system is large enough the first AP is already waiting at the
        first synchronization point when the boot CPU finished the wake-up of
        the last AP. That reduces the AP bringup time on that SKL from ~800ms
        to ~80ms, i.e. by a factor ~10x.
    
        The actual gain varies wildly depending on the system, CPU, microcode
        patch size and other factors. There are some opportunities to reduce
        the overhead further, but that needs some deep surgery in the x86 CPU
        bringup code.
    
        For now this is only enabled on x86, but the core functionality
        obviously works for all SMP capable architectures.
    
      - Enhancements for SMP function call tracing so it is possible to locate
        the scheduling and the actual execution points. That makes it possible
        to measure IPI delivery time precisely.
    
  • irq-core-2023-06-26

    Updates for the interrupt subsystem:
    
     - Core:
    
       - Convert the interrupt descriptor storage to a maple tree to overcome
         the limitations of the radix tree plus fixed size bitmap. This makes
         it possible to handle really large servers with a huge number of
         guests without imposing a huge memory overhead on everyone.
    
       - Implement optional retriggering of interrupts which utilize the
         fasteoi handler to work around a GICv3 architecture issue.
    
     - Drivers:
    
       - A set of fixes and updates for the Loongson/Loongarch related drivers.
    
       - Workaround for an ASR8601 integration hiccup which ends up with CPU
         numbering which can't be represented in the GIC implementation.
    
       - The usual set of boring fixes and updates all over the place.
    
  • core-debugobjects-2023-06-26

    A single update for debug objects:
    
      - Recheck whether debug objects is enabled before reporting a problem to
        avoid spamming the logs with messages which are caused by a concurrent
        OOM.
    
  • v6.4

    6995e2de · Linux 6.4
    
  • v6.3.9-danctnix1

    DanctNIX kernel v6.3.9-danctnix1