
rooots
Status: Interested
Joined: 17 May 2020
Posts: 43
Reply Quote
Um...two questions: first, how does that relate to my initial issue, second, how did you acquire that log?

Could it be that you posted in the wrong thread?! ;-)
Back to top
Mono
Status: Contributor
Joined: 21 Jun 2012
Posts: 115
Reply Quote
Sorry, I meant to post those last 2 replies in another thread.
Back to top
techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4128
Location: East Coast, West Coast? I know it's one of them.
Reply Quote
rooots, you appear to be running the new Alder Lake Intel CPU, which is certainly going to have issues, in particular with the scheduler. This might be of interest to damentz in more ways than he realizes.

Last I saw, in CPU discussions last week, the Linux kernel doesn't have stable support for the new Alder Lake performance/efficiency core architecture, and sometimes gets confused about which core to assign a process to, leading to crashes and failures.

I'm particularly interested in seeing real data from your cpu for current inxi 3.3.09:

You can edit /etc/inxi.conf and set B_ALLOW_UPDATES=true, save it, then run: inxi -U
Then run: inxi -Ca --dbg 39 > cpu-data.txt
Upload that file to a pastebin like paste.debian.net/ and post the link.

Note that current inxi doesn't have any of these features or support yet, and the full fixes for advanced CPU architectures like Alder Lake are still not in place.
Back to top
damentz
Status: Assistant
Joined: 09 Sep 2008
Posts: 1143
Reply Quote
Wow, I didn't even notice until you pointed that out. Yes, it looks like there's a lot of early adopter issues going on here.

There's a lot to say now that I know this:
1) Liquorix is probably not the best kernel to run on Alder Lake at the moment, at least until there's a solution for properly setting affinity and scheduling appropriate tasks on the P and E cores. There's some work in Project-C to implement preferred CPUs based on some dynamic signal or kernel boot parameter, but it's not going to get the effect you're looking for (especially once all your P-cores are utilized).

2) Mainline is not any better, and it doesn't seem to know which cores to schedule tasks on either.

3) You'll probably eventually need to install some type of daemon that uses the signals from "Intel Thread Director" to set CPU affinity. I have no idea if there's anyone working on this.

4) New chipsets like Z690 will have lots and lots of bugs since it's a brand new chipset with not much shared from the previous generation. Especially since Intel didn't innovate for so long, what you're experiencing is the accrual of technical debt all being "resolved" at the same time. Everything is changed and now everything is also unstable.
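
Until a daemon like the one in point 3 exists, static pinning is the blunt workaround. Below is a minimal sketch, assuming the kernel exposes the P-core list at /sys/devices/cpu_core/cpus (treat that path as an assumption and check it on your own system) and that taskset from util-linux is installed. The names run_on_pcores and PCORE_LIST are made up for this example. This is plain affinity pinning, not anything driven by Intel Thread Director.

```shell
#!/bin/sh
# Run a command restricted to the P-cores of a hybrid CPU.
# Assumption: the P-core CPU list lives in /sys/devices/cpu_core/cpus.
# PCORE_LIST is a hypothetical override, handy for testing or odd setups.
run_on_pcores() {
    list_file="${PCORE_LIST:-/sys/devices/cpu_core/cpus}"
    if [ -r "$list_file" ]; then
        # taskset -c takes a CPU list such as "0-15" followed by the command.
        taskset -c "$(cat "$list_file")" "$@"
    else
        # No hybrid topology information found: run the command unpinned.
        "$@"
    fi
}
```

Usage would look like: run_on_pcores make -j16. If the list file is missing, the command simply runs unpinned, so the wrapper is harmless on non-hybrid CPUs.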

Not saying you made a bad purchase, but with all that said, I really don't think Alder Lake is a great CPU to run Linux on at the moment. Maybe 1 year from now it'll be clear if it's the better buy, but classic flat CPU designs where all cores are roughly equal are what Linux is optimized for, outside of custom schedulers used by phone SoC makers.

With this all said, I don't think there's really much we can do to help you here. Either return the hardware or be prepared to submit many bugzilla issues or participate in LKML to get your hardware more stable.
Back to top
rooots
Status: Interested
Joined: 17 May 2020
Posts: 43
Reply Quote
Thanks for your input. I'm aware of the fact that current kernels do not have proper support for the new architecture and that improved schedulers probably won't even land with 5.16 according to Phoronix.

However, this is a non-issue for me because I disabled my E-cores by default anyway, and I'm seeing pretty decent performance with 8C/16T and OCTVB hitting up to 5.6GHz. Aside from a handful of freezes while initially dialing in the proper VCore, I have not experienced any issues so far, and the CPU is running fine even under heavy 16T math loads.

@techAdmin: I will provide that info you asked for - do you want it for E-Cores enabled or any other specific settings?

@damentz: I have not reported any kernel bugs yet, so if I understood you correctly, bugzilla.kernel.org/ would be a good starting point to report the current issue?


Cheers,
r.
Back to top
techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4128
Location: East Coast, West Coast? I know it's one of them.
Reply Quote
rooots, with both, but use pinxi, the development version of inxi, which has more logic running in it for cpu handling than inxi does.



:: Code ::

sudo wget -O /usr/local/bin/pinxi smxi.org/pinxi && sudo chmod +x /usr/local/bin/pinxi

# run these as regular user, not sudo/root:
# with ecores
pinxi -Ca --dbg 39 > cpu-data-pe.txt

# without ecores
pinxi -Ca --dbg 39 > cpu-data-p.txt

# and also:
# with ecores
cat /proc/cpuinfo > cpuinfo-pe
# without ecores
cat /proc/cpuinfo > cpuinfo-p


I'm not sure yet how these cpus show their data to the system, how caches are handled in the data, how they show per core data, etc.

I'm in the middle of fully refactoring the core cpu data logic, specifically because of alder lake and coming zen cpus, but also to fix a variety of subtle failure cases that have existed for a long time.

Thanks
Back to top
rooots
Status: Interested
Joined: 17 May 2020
Posts: 43
Reply Quote
So, here are the outputs. I had to split the PE log because it was larger than the allowed Pastebin file size. Hope that helps, let me know if you need more info.

r.

P only
[url]
paste.debian.net/hidden/ab233ee6/
[/url]

PE
[url]
paste.debian.net/hidden/2aa59142/
paste.debian.net/hidden/62df8331/
[/url]

< Edited by rooots :: Nov 24, 21, 9:57 >

Back to top
rooots
Status: Interested
Joined: 17 May 2020
Posts: 43
Reply Quote
:: damentz wrote ::
Either return the hardware or be prepared to submit many bugzilla issues or participate in LKML to get your hardware more stable.


By the way, I reported the bug upstream and was asked to do a bisect. Never done that before but with E-cores enabled it only took a bit more than 6 hrs for 12 iterations :-) Appears to be a problematic commit to igc, see the mailing list link below if you're interested.

lore.kernel.org/regressions/8119066974f099aa11f08a4dad3653ac0ba32cd6.camel@gmx.de/
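
For anyone else who has never bisected: each step halves the remaining range, so 12 iterations cover roughly 2^12 = 4096 commits. Here is a self-contained sketch of the workflow in a throwaway repository. Everything in it (the file n, the 16 commits, the pretend regression at commit 11) is made up for illustration and has nothing to do with the actual igc bisect. On a real kernel the test step is a build, a reboot and a manual check rather than a one-liner.

```shell
#!/bin/sh
# git bisect demo in a temporary repository (all contents are illustrative).
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q

# Create 16 commits; pretend the regression arrives with commit 11.
i=1
while [ "$i" -le 16 ]; do
    echo "$i" > n
    git add n
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "commit $i"
    i=$((i + 1))
done

# Mark HEAD as bad and the very first commit as good.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first" >/dev/null

# The test command exits 0 for a good revision, non-zero for a bad one.
git bisect run sh -c 'test "$(cat n)" -lt 11' >/dev/null

# refs/bisect/bad now points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $culprit"

# Return the work tree to where it was before bisecting.
git bisect reset >/dev/null 2>&1
```

With 16 commits this takes about 4 steps; the manual equivalent is to run git bisect good or git bisect bad after each test boot instead of using git bisect run.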

Cheers,
r.
Back to top
techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4128
Location: East Coast, West Coast? I know it's one of them.
Reply Quote
rooots, looks like my research paid off, the educated guesses I made about how to handle this seem to be nailing it.

I was originally going to hardcode in these values, which would have been insane, and also would be wrong very frequently, so I made it fully dynamic when using the new logic.

Interesting observations with cache, which new pinxi logic handles without any issues, seamlessly:

8 p-cores with L1, L2 cache per core.
8 e-cores with L1 per core, and L2 shared between 4 cores
L3 always shared between all cores.
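
The sharing pattern above can be checked directly from sysfs. This is a minimal sketch that reads the standard cache/index*/level and shared_cpu_list files under /sys/devices/system/cpu and prints one line per distinct L2 group. It is not how pinxi does it internally; the function name l2_groups and its optional root argument are made up for this example.

```shell
#!/bin/sh
# Print the distinct groups of logical CPUs that share an L2 cache.
# The sysfs root is an optional argument so a test tree can be used instead.
l2_groups() {
    root="${1:-/sys/devices/system/cpu}"
    for level_file in "$root"/cpu[0-9]*/cache/index*/level; do
        # Skip unreadable entries (also covers a glob that matched nothing).
        [ -r "$level_file" ] || continue
        # Only look at level 2 caches.
        [ "$(cat "$level_file")" = "2" ] || continue
        # shared_cpu_list holds the logical CPUs sharing this cache, e.g. "16-19".
        cat "${level_file%/level}/shared_cpu_list"
    done | sort -u
}

l2_groups
```

On a CPU where a cluster of 4 cores shares one L2, those 4 logical CPUs show up together on a single line, while cores with a private L2 show up with only their own thread(s).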

I was anticipating complicated scenarios that could not be guessed at with any reliability, so all that guessing logic is gone now, or in the process of being removed.

As you can see the core counter is still not implemented, but it will be. That also will not guess anymore, but rely 100% on what data is actually present in the system, dynamically.

I should have the core counter logic done sometime in next few days, depending on free time.

A big chunk is done now, and new features are starting to appear based on the new logic, but I have to be very careful because all the old logic has to keep working, which is a very tricky balance.

Appreciate the data, very helpful.

Once the rest of the logic is done, inxi will get much more granular data for cpu stuff. I have to take it very carefully, one step at a time, because it's very old code with tons of hacks added over the years to handle weird situations, and all of those have to keep running as fallbacks in case the new logic fails for any subsection of the data.

There's some super granular stuff I may or may not look at more closely as well, but that can mostly be added after the main stuff is rewritten.

You'll know it's working when you run pinxi -U then pinxi -Cxxx and see the right core counts for Alder Lake with e and p-cores enabled. It should also work fine for other cpu types, including arm, where some chips already combined different core types with different min/max speeds; inxi had a very crude hack to handle that situation.
Back to top
techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4128
Location: East Coast, West Coast? I know it's one of them.
Reply Quote
Note that pinxi 3.3.09-08 is largely feature complete, in theory, and only requires data and testing to confirm, particularly on alder lake.

:: Code ::
pinxi -U && pinxi -CMazy

and if issues:
:: Code ::
pinxi -CMazy --dbg 39 --dbg 8 > cpu-data.txt

then upload to a pastebin service.
Back to top
All times are GMT - 8 Hours