Recommendations: laptop with touch screen, Ubuntu + Rhasspy

Hi folks

I’m under the gun for a migration that needs to work (or nearly) out of the box (ha).

So, if anyone has a reasonably recent laptop with the following characteristics:

  • touch screen
  • running Ubuntu or a derivative (Mint, Kubuntu etc.)
  • integrated camera and mic

that has worked without problems with a recent Rhasspy release, I’d be really grateful for recommendations.

I have an old ThinkPad from the university and it’s having a tough time with the microphone. That’ll probably get sorted, but I’d prefer not to deal with something like that right now. It’s for this: Home · hbarnard/mema Wiki · GitHub, a university project for older people to speak and store their memories and stories, so a fairly non-commercial cause!

It’s a strange request, Hugh, as it’s sort of in the name: Rhasspy is aimed at Raspberry Pi level hardware, so practically any laptop has a lot more power than is needed.
There is very little that will not run Linux natively because of esoteric drivers, so you’re really just talking about a laptop with a touchscreen.

For the first time ever I would suggest an Apple product: not a laptop but the new M1-based iPad Air, as the ML perf is truly amazing and you can use SOTA apps like Whisper that make Rhasspy look like a bad toy. I also think it has 5 mics and a touchscreen, is it £699? Same chip as the MacBook Pro, I think, and the benchmarks it has are amazing.

Otherwise it’s any eBay laptop with a touchscreen, but the touchscreen could be one of those esoteric driver issues. If it’s a main brand like HP or Dell, a quick Google will probably show whether anyone is screaming about Linux drivers, and you can check the download list. Often, though, if drivers are not listed there it doesn’t mean they are missing; it means they are already kernel drivers…

Thanks, in fact, after discussion we’ve removed the laptop touchscreen ‘issue’ from the table (!). I’ve compiled whisper.cpp on the Thinkpad and it works pretty well. Terrible on the Pi4, but that’s fair enough. My current way around on the Pi4 is to put transcription into a background cron job.
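
The background cron job idea could be sketched roughly as below. This is only an illustration, not the script actually used in the project: the helper names, the one-transcript-per-recording layout and the paths in WHISPER_BIN and MODEL are all assumptions, and the command line simply mirrors the ./main call quoted elsewhere in this thread. It assumes whisper.cpp prints the transcript on stdout.

```python
import subprocess
from pathlib import Path

# Assumed locations; adjust to wherever whisper.cpp was built.
WHISPER_BIN = "./main"
MODEL = "models/ggml-small.en.bin"


def pending_recordings(directory):
    """Return .wav files that do not yet have a .txt transcript beside them."""
    directory = Path(directory)
    return sorted(
        wav for wav in directory.glob("*.wav")
        if not wav.with_suffix(".txt").exists()
    )


def build_command(wav):
    """Command line for one recording, same form as the ./main call in this thread."""
    return [WHISPER_BIN, "-m", MODEL, "-f", str(wav)]


def transcribe_pending(directory):
    """Transcribe each pending recording and save stdout as its transcript."""
    for wav in pending_recordings(directory):
        result = subprocess.run(
            build_command(wav), capture_output=True, text=True, check=True
        )
        wav.with_suffix(".txt").write_text(result.stdout)
```

A crontab entry would then call transcribe_pending() on the recordings folder every few minutes. On a Pi4 a run may take longer than the interval, so some kind of lock file would probably be needed to stop two runs overlapping.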

I agree about Apple, though not my favourites either. As I understand from the whisper.cpp repository, it’s optimised for Apple linear algebra libraries. We’ll have to see how all this evolves.

There is another option, Hugh: the Radxa Rock5b and OrangePi5 use RK3588/RK3588S chips, which are 8-core Cortex-A76/A55 on Arm v8.2 and also seem to benefit.
The A76 added the SDOT and UDOT instructions, which give access to many multiply-and-accumulate operations every cycle and so greatly speed up ML.
The Mac M1 is very much just Arm, but a much newer and more powerful one than a Pi. They have added further ML instructions themselves, but as with the Pi’s A72 cores against the A76, the equivalent core some 4 generations up is just much faster with the improvements Arm made.
The project is optimised for Arm now that Apple have an Arm base, but for the newer chips.

It’s not an M1 Mac Mini, but it has nearly 5x the ML perf of a Pi4 on the CPU, starting at approx £80 delivered. The GPU can also use ArmNN and is almost as powerful as the CPU, and to top that off it also has a 3-core 2 TOPS NPU.

How many mics does the ThinkPad have? I might be able to assist.
Whisper is great using the Medium model, which in real time is really M1 territory, but the streaming mode is not that great as the model is designed with a 30 sec beamsearch (or is it 10 sec? My memory, lol).

The small model isn’t bad, as Whisper, like ChatGPT, is generally a step ahead, and you could likely do a better streaming version, maybe using pyannote/voice-activity-detection · Hugging Face or pyannote-audio/voice_activity_detection.ipynb at develop · pyannote/pyannote-audio · GitHub, where you aim to fill the beamsearch with the nearest-fitting 30/10 sec clips and just drop them into a folder watcher using inotify.
As well as performance, the fuller beamsearch can aid accuracy, as much of what Whisper does, like ChatGPT, is about how it makes context from a sentence.
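
The clip-filling idea can be sketched as a greedy packing step. This is a rough sketch under assumptions, not code from pyannote or Whisper: it assumes a voice activity detector has already produced a sorted list of (start, end) speech times in seconds, and pack_segments is a hypothetical helper name. A single segment longer than the window is passed through unchanged and would need splitting elsewhere.

```python
def pack_segments(segments, window=30.0):
    """Greedily merge consecutive speech segments into chunks spanning <= window seconds.

    segments: sorted list of (start, end) tuples in seconds, e.g. from a VAD.
    Returns a list of (start, end) chunks to cut from the audio.
    """
    chunks = []
    cur_start = cur_end = None
    for start, end in segments:
        if cur_start is None:
            # First segment opens the first chunk.
            cur_start, cur_end = start, end
        elif end - cur_start <= window:
            # Segment still fits in the current window: extend the chunk.
            cur_end = end
        else:
            # Window is full: close the chunk and start a new one here.
            chunks.append((cur_start, cur_end))
            cur_start, cur_end = start, end
    if cur_start is not None:
        chunks.append((cur_start, cur_end))
    return chunks


speech = [(0.0, 8.0), (9.0, 20.0), (21.0, 29.5), (31.0, 40.0), (70.0, 75.0)]
print(pack_segments(speech))
# [(0.0, 29.5), (31.0, 40.0), (70.0, 75.0)]
```

Each chunk could then be written out as its own clip into the watched folder. Passing window=10.0 covers the shorter case if 10 sec turns out to be the right figure.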

The models are quite large: small will fit into 4gb, but medium really could do with 8gb, as on 4gb it will just spill over and start using swap, likely zram if you have it set up.

OkDo only seem to do the 8gb/16gb Radxa Rock5b, which has 2x M.2 and supports full PCIe 3.0 x4 NVMe.

The OrangePi5 I got was £80 delivered, though only 4gb, and that was from AliExpress.

Also, if you append -march=native -ffast-math to those 2 lines (the CFLAGS and CXXFLAGS lines shown below) you may find quite a perf boost, as it did on the RK3588.

Then make clean and make again.

Built with the flags below and run with ./main -m models/ggml-small.en.bin -f samples/jfk.wav on an OrangePi5 4gb:

CFLAGS   = -I.              -O3 -std=c11   -march=native -ffast-math -fPIC
CXXFLAGS = -I. -I./examples -O3 -std=c++11 -march=native -ffast-math -fPIC
main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:08.000]   And so, my fellow Americans, ask not what your country can do for you.
[00:00:08.000 --> 00:00:11.000]   Ask what you can do for your country.


whisper_print_timings:     load time =   617.37 ms
whisper_print_timings:      mel time =   116.57 ms
whisper_print_timings:   sample time =     8.11 ms
whisper_print_timings:   encode time = 10194.69 ms / 849.56 ms per layer
whisper_print_timings:   decode time =  2003.76 ms / 166.98 ms per layer
whisper_print_timings:    total time = 12941.53 ms

The full beamsearch of the OpenAI install on the OrangePi5

(venv) orangepi@orangepi5:~/whisper$ time python whisper.py
/home/orangepi/whisper/whisper/transcribe.py:78: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
 And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.

real    0m29.145s
user    2m44.884s
sys     0m6.229s
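
To put the two runs side by side, the timings can be turned into a real-time factor (processing time divided by audio length). This is a small worked calculation using only the numbers quoted in the logs. It assumes the Python run used the same 11 sec jfk.wav clip, which the matching transcript suggests but the post does not state. The model size used in the Python run is also not given, so this is not necessarily a like-for-like comparison.

```python
def realtime_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds


AUDIO_SECONDS = 11.0             # clip length reported by whisper.cpp for jfk.wav
whisper_cpp = 12941.53 / 1000.0  # "total time" in ms from whisper_print_timings
openai_python = 29.145           # "real" from the time command, includes start-up

print(f"whisper.cpp:   {realtime_factor(whisper_cpp, AUDIO_SECONDS):.2f}x real time")
print(f"OpenAI Python: {realtime_factor(openai_python, AUDIO_SECONDS):.2f}x real time")
# whisper.cpp:   1.18x real time
# OpenAI Python: 2.65x real time
```

Both totals include model load time, so on a clip this short the per-second cost of transcription itself is somewhat lower than these figures suggest.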

Hi all, thanks for this. I’m now handing most of it over to the uni, partly because of the current wrangle about Horizon funding. I bought an Orange Pi5, which is in the box at the moment, so all this will be v. useful. I’m helping them choose something with a little more punch (and availability!) than the Pi4. So more in a while. For fun, here’s the knocked-together prototype from November. The social science people like the little ‘ears’.
(image: IMAG1104-small)

OPi5 is great value and punches so far above a Pi4.

I managed to pick up a refurb Pixel 6a for £200 and the offline transcription is pretty amazing. There is also a Whisper app; it’s little more than a demo, but it is super quick.