I'm talking about the upcoming processors that integrate more and more coprocessor features and vector instructions as extensions (grouping ranges of registers in pairs to create larger registers, with dedicated ALU/FPU units that are increasingly parallelized across multiple processing channels). There's a trend now, with AI units and GPUs, and also a need for new kinds of applications that process huge amounts of data containing tiny bits of information that can only be detected within a large amount of noise, currently hidden/masked by the limited precision.
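To illustrate those wider vector registers with parallel lanes, here is a minimal sketch, assuming an AArch64 target with NEON (e.g. a recent Apple or ARM SoC) and GCC or Clang: one 128-bit register holds four 32-bit lanes, and the vector ALU adds all four in a single instruction.

  #include <arm_neon.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      int32_t a[4] = {1, 2, 3, 4};
      int32_t b[4] = {10, 20, 30, 40};
      int32_t r[4];

      int32x4_t va = vld1q_s32(a);       /* load four 32-bit lanes into one 128-bit register */
      int32x4_t vb = vld1q_s32(b);
      int32x4_t vr = vaddq_s32(va, vb);  /* one instruction adds all four lanes in parallel */
      vst1q_s32(r, vr);

      printf("%d %d %d %d\n", r[0], r[1], r[2], r[3]);  /* prints: 11 22 33 44 */
      return 0;
  }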
Apple and Google are already selling their new processors; even though they are based on a 64-bit architecture, they include vector extensions and 128-bit extensions. We'll see more and more use of these wider types, because this is not only about the common desktop or web applications we have used until now (including web apps for mobiles): these devices are massively interconnected over faster and larger networks, and the number of devices with processing capabilities is exploding.

The industry finds new applications every day: we are no longer in the era of experimental, specialized applications for specific domains on specific installations. We find them everywhere, in all sorts of objects (IoT devices have been around for a long time now; their local processing may be limited, but they work within a very capable and very wide network that provides and largely extends their services and usability to more and more users). Even basic users could still use them with small Lua scripts that integrate into the compound system and will want to use the same datatype (it will not necessarily be slow).
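On the "not necessarily slow" point: even on today's 64-bit architectures a 128-bit integer need not be costly, because compilers already synthesize it from a pair of 64-bit registers. A minimal sketch, assuming GCC or Clang (which provide the non-standard __int128 extension) on a 64-bit target:

  #include <stdio.h>
  #include <stdint.h>

  int main(void) {
      /* Multiply two 64-bit values without losing the high half of the result;
       * the compiler lowers the 128-bit value to a pair of 64-bit registers. */
      unsigned __int128 x = (unsigned __int128)UINT64_MAX * UINT64_MAX;

      /* printf has no 128-bit conversion, so print the two 64-bit halves. */
      printf("high: %llu low: %llu\n",
             (unsigned long long)(x >> 64),
             (unsigned long long)(x & UINT64_MAX));
      return 0;
  }

A scripting language such as Lua could expose a similar wider datatype through its host C implementation in the same spirit, without requiring new hardware.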