Hacker News | Orphis's comments

To add to this, you can add in your .gitmodules the name of the branch that the --remote flag will follow. It just works.
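For illustration, a minimal sketch of what that could look like. The submodule name, path, URL and branch below are made up; only the `branch` key is the point:

```ini
[submodule "third_party/libfoo"]
	path = third_party/libfoo
	url = https://example.com/libfoo.git
	branch = main
```

With that in place, `git submodule update --remote` should move the submodule to the tip of `main` on its remote instead of the remote's default branch.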

While it would help for some use-cases, it wouldn't necessarily reduce the problem that a browser is facing when dealing with malicious code in a large and complex codebase. And vetted people can be victims of supply-chain attacks, which makes it still hard to evaluate a change properly.

It's not an impossible problem, but it's a resource allocation problem, and they don't seem to have a way to address it at the moment besides closing all PRs.


One of the interesting uses of AV1 was specifically for low bitrate calls, and software encoding was perfectly fine, even on mobile.

With low enough resolution, framerate and bitrate, you can get a quality stream without significant encoding artifacts compared to any other codec. It is in production right now and has been for a while.

The CPU / bandwidth tradeoff is quite advantageous in situations like this. And no, AV1 HW encoders usually cannot be used: they are not designed for tight bitrate control or realtime communications the way software encoders usually are.


> One of the interesting usage of AV1 was specifically for low bitrate calls, and software encoding was perfectly fine, even on mobile.

You really want hardware decoding on mobile, otherwise you end up with 40 minutes battery life. Fortunately, for typical videoconference resolutions, VP8 and H.264 are just fine. AV1 is nice to have, though, due to excellent support for synthetic content (screen sharing), and for scalable video coding (a much more elegant solution than simulcast, IMHO).

In the world I live in, the general plan is to stick to VP8 and H.264 for the time being, and to skip to AV1 when it's universally available on mobile. I haven't seen any features of AV2 which would justify waiting for it.


No, you do NOT want hardware anything on mobile if you are targeting smaller bitrates that are not that taxing on the CPU, when the conditions are otherwise so bad that the call would either drop or be unusable. HW encoders produce bad results at low bitrate. HW decoders usually have issues with the temporal encodings used and they may also just not accept those streams (a lot of test scenarios are movies, and the RTC tools are poorly supported).

I worked on shipping it to Chromium, WebRTC and Google Meet many years ago and we had many publications about it:
- https://blog.google/products-and-platforms/products/duo/4-ne...
- https://webrtchacks.com/the-hidden-av1-gift-in-google-meet/

The use case is not screensharing or a large conference room, but mainly a simpler talking face for a 1:1 chat, with good quality, as packet loss is then not as impactful on a 30KBps AV1 stream as on a 50KBps VP8 stream.


> we had many publications about it

I'd be interested in learning more, but the links you provide are just advertising copy. Could you please provide links to actual technical articles on your conclusions?


The internals are usually confidential and it's hard to find an engineer willing to make a comprehensive write-up about those: they want to make tech and not spend time proofing a tech write-up for public consumption (they already had to make an internal one!).

So the middle ground is that you have those "marketing" copies that demo the tech. One of the telling parts of those is how you can get a fine, usable 30KBps stream at very low bitrate with AV1 compared to a higher bitrate H264 stream that is unusable. It doesn't tell you that, because you are using far fewer bytes, you will be trading CPU power consumption for radio power consumption. It's a tricky comparison, but in general it's a favorable trade for the user who has very bad network conditions and is trying to make a call. The goal is to make the call work at all costs, not to save the battery and end up with a useless stream of data being transferred.


> HW encoders produce bad results at low bitrate.

Is that poor implementation or is it inherently harder to implement in hw encoders?


There are a few reasons; I suspect fixed resource depth might be a factor in poor hardware single pass encoding ...

What does limit them, though, is pseudo real time single pass pipelines.

I see the best encoding results from two pass - one fast run to work out the easy compress and hard compress parts of a video and then a second pass to get the optimal results on a stream that's already got a budget in mind for each section through the advantage of foresight as to what's left to do.
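To make the "budget in mind for each section" idea concrete, here is a toy sketch. It is not how any real encoder does rate control; the complexity scores stand in for whatever a fast first pass would measure, and `allocate_bits` is a made-up helper that just splits a total budget proportionally:

```python
from typing import List


def allocate_bits(complexity: List[float], total_bits: int) -> List[int]:
    """Split total_bits across segments in proportion to first-pass complexity.

    Assumes non-negative complexity scores; a real encoder uses far richer models.
    """
    total = sum(complexity)
    if total <= 0:
        raise ValueError("need at least one segment with positive complexity")
    raw = [total_bits * c / total for c in complexity]
    bits = [int(r) for r in raw]
    # Hand out the bits lost to rounding, largest fractional part first.
    leftover = total_bits - sum(bits)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - bits[i], reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits


# Hypothetical first-pass result: segment 3 is the hard one (lots of motion).
first_pass_complexity = [1.0, 3.0, 1.0, 5.0]
budget = allocate_bits(first_pass_complexity, total_bits=1000)
```

A single pass encoder has no such foresight: it has to guess how much to spend on the current segment without knowing what is coming next.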


As someone else said, it's poor single pass encoding performance targeted for the tools used in real-time communications. This type of usage is "new" to hardware manufacturers and they test it poorly, as it's easier to make a chip good enough for decoding the general case of watching movies on your favorite platform than to do something comprehensive.

One aspect of real-time encoding is that the frames are not ordered or structured the same linear way as they used to be in older formats. Now, we have temporal and spatial encoding, which allows for better frame drops or efficiency or a stream that is decodable at multiple resolutions at the same time.

An example of temporal encoding is that you have a sequence of frames at 15fps (T0) that are all referencing the previous one, and sometimes an I-frame that is a full independent picture you can start decoding from. Then, you can have another temporal layer (T1), where for every frame at the base 15 fps layer (T0), you insert a new frame that depends on it. You end up having a 30 fps stream! And if your network connection is worse, or your hardware can't keep up, you can drop the T1 layer and only use the T0 layer. This works great for real-time! In the specs, you could have more layers with more complex dependency chains, but 3 layers is as high as you want to go.
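The two-layer structure described above can be sketched as a small dependency model. This is only an illustration of the T0/T1 idea, not any codec's actual bitstream; `Frame`, `build_two_layer_stream` and the other names are invented for the example:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    index: int                 # position in the full 30 fps sequence
    layer: int                 # 0 = T0 (base, 15 fps), 1 = T1 (enhancement)
    depends_on: Optional[int]  # index of the referenced frame, None for a keyframe


def build_two_layer_stream(n_frames: int) -> List[Frame]:
    frames = []
    for i in range(n_frames):
        if i == 0:
            frames.append(Frame(i, 0, None))   # I-frame: decodable on its own
        elif i % 2 == 0:
            frames.append(Frame(i, 0, i - 2))  # T0 references the previous T0
        else:
            frames.append(Frame(i, 1, i - 1))  # T1 references the T0 just before it
    return frames


def drop_layers_above(frames: List[Frame], max_layer: int) -> List[Frame]:
    return [f for f in frames if f.layer <= max_layer]


def is_decodable(frames: List[Frame]) -> bool:
    available = {f.index for f in frames}
    return all(f.depends_on is None or f.depends_on in available for f in frames)


stream = build_two_layer_stream(8)       # 8 frames at "30 fps"
base_only = drop_layers_above(stream, 0)  # 4 frames at "15 fps", still decodable
```

Nothing in T0 references T1, which is why dropping T1 is safe, whereas losing a T0 frame breaks everything that follows until the next keyframe.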

Spatial encoding is a bit different: you will have frames at the highest resolution, but they reference another frame at half the resolution (which may also do the same). Each higher layer means just adding more details over the smaller size frame that you have at the base. To decode an image, you need to have all the frames available. This can also be combined with the temporal encoding above. While this isn't useful for a 1-1 communication, in conference rooms it's a great optimization: while you may send your full HD picture to the server, you may not want to send that to everyone when you're just a thumbnail who is not actively speaking. So the conference server will not send the full HD picture, but the lower resolution only. And since you don't want to do the encoding on the server (it's expensive, slow and you need to trust the intermediate service with your secret stuff), doing spatial encoding on the client side is better.
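As a rough sketch of the server side of this, here is how a conference server might pick which spatial layers to forward to a given receiver. The three-layer ladder and the `layers_to_forward` helper are assumptions for illustration, not how Meet or any specific server is implemented:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SpatialLayer:
    sid: int     # 0 = lowest resolution
    width: int
    height: int


# Hypothetical ladder: each layer doubles the resolution of the one below.
LADDER = [
    SpatialLayer(0, 480, 270),
    SpatialLayer(1, 960, 540),
    SpatialLayer(2, 1920, 1080),
]


def layers_to_forward(target_height: int, ladder: List[SpatialLayer] = LADDER) -> List[SpatialLayer]:
    """Return the smallest set of layers that covers target_height.

    A higher layer is only decodable with every layer below it, so the
    result is always a prefix of the ladder, starting at the base layer.
    """
    ordered = sorted(ladder, key=lambda layer: layer.sid)
    top = ordered[-1].sid  # fall back to everything if nothing is big enough
    for layer in ordered:
        if layer.height >= target_height:
            top = layer.sid
            break
    return [layer for layer in ordered if layer.sid <= top]


thumbnail = layers_to_forward(180)        # base layer only
active_speaker = layers_to_forward(1080)  # all three layers
```

The server never re-encodes anything here: it only decides which of the client's already-encoded layers to pass along.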

Those techniques are all advanced ones that would be used if available universally. Unfortunately, a lot of hardware decoders choke on those, despite being part of the specs. And it's not just that they can't generate a stream with those, they also sometimes can't decode them (breaking the spec).

And finally, the hardware encoders are tuned for higher bitrate work. Ask them to do a 3MBps stream, they'll do fine. Ask them for a 30KBps stream, they'll make garbage most of the time.


Had you said this about audio codecs, I would have agreed. I do not know a single smartphone video conferencing app that uses CPU encoding rather than hardware encoding. Neither WhatsApp nor FaceTime, perhaps the two largest real-time video call services, uses AV1.


Yeah, no production or large scale VC system is running software AV1 encoders on smartphones. You will drain a full phone battery in 1-2 hours of calls.

It just doesn’t make sense and will result in extraordinary power/battery drainage at best, or output that’s worse than hardware encoding.

The only way you could get AV1 to software encode in realtime AND low latency on a mid-range Android chip is by disabling or skipping nearly all of the compression/encoding features that make it good at low bitrate.


> Yeah, no production or large scale VC system is running software AV1 encoders on smartphones. You will drain a full phone battery in 1-2 hours of calls.

Yeah but, half jokingly, Zoom does that (draining the battery in an hour) already :P


So, status remains quo, the commons remain tragic, and glory to H.264 forever?


> tragic

H.264 isn't even that bad at all, if not the best depending on how you look at it. Our Internet bandwidth, both on the backend and on the front end with mobile 5G, is increasing with plenty more room to grow, while decoding computation and storage aren't.

I.e., if bandwidth is infinite and free, and we are only optimising for decoding power usage, H.264 wins in a lot of these scenarios.


H264 is lacking a lot of features (behind patents) that are essential to real-time communications. It's available, but it is by far the worst offender for call quality. Modern call technology will want to use temporal and spatial scaling, which are not available in the profiles supported by most H264 encoders and decoders.

Those tools are available for VP8 (temporal only), VP9 and AV1 and improve the quality of calls quite a lot when used right. I don't know about the internals of H265 and H266 as those are also behind patents and no one wanted to touch them in the real-time conferencing space.


At least until a better codec has widespread enough hardware support, I think.


Google Meet can do it. You don't want the full conference with AV1, just use it for very low bitrate scenarios, possibly with high packet loss. Phones are a good target system. And I know this is quite opposite to expectations.

It is a lot better to send a stable and visually ok stream with AV1 at 30KBps than to fail to send a 50KBps VP8 stream that is unusable anyway and is subject to twice as many lost packets as a lower bitrate solution.

It is possible they use AV1 in other scenarios now, but I left the team a while back now and I haven't checked what they are now using under the hood.


Canal+ had a few animes not suited for kids and a few others that didn't really fit the catalog from TF1 or TMC (which was mostly available south of France). Those 2 had volume, Canal+ had more "quality" ones.

I remember watching Akira, some DBZ movies, Evangelion, Vision of Escaflowne, Armitage III and many others!


Zstd is used in a lot of places now. Lots of servers and browsers support it as it is usually faster and more efficient than other compression standards. Some Linux distributions also compress their packages, or even the kernel, with it, which is preferred in situations where decompression speed matters more than storage cost.


There are also alternatives that can be good enough, such as the Swedish BankId system, which is managed by a private company owned by many banks. It provides authentication and a chain of trust for the great majority of the population on nearly all websites (government, healthcare, banking and other commercial services) and is also used to validate online payments (3D Secure will launch the BankId app).

While it's not without faults (services do not always support an alternative authentication method, which foreigners with the right to live in the country may need), it has been quite reliable for many years.

So just to say, you can have successful alternatives to a government controlled system as many actors may decide it is quite valuable to develop and maintain such a system and that it aligns with their interest, and then have it become a de-facto standard.


How does that prevent the ID service from discovering which services you use it for?


You don't "get into the sandbox", if a cheat program opted in, they would be launched into a separate instance that's distinct from the game.

And you would sign your files, which get verified by the integrity platform and allow you to authenticate with the servers securely.


Sounds very similar to total platform lockdown


It is similar except it's only a total lockdown of the sandbox.


In some cases, you can start by using the "at" functions (openat...) to work on a directory tree. If you have your logical "locking" done at the top-level of the tree, it might be a fine option.
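A minimal sketch of the "at" approach, assuming a POSIX system where Python's `dir_fd` argument maps to `openat`. The `read_in_tree` helper is made up for illustration:

```python
import os


def read_in_tree(dir_fd: int, relative_path: str) -> bytes:
    """Read a file relative to an already-open directory (openat under the hood).

    The lookup starts from dir_fd, not from a path string, so it keeps
    pointing at the same directory even if that directory is renamed.
    """
    fd = os.open(relative_path, os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as handle:  # fdopen takes ownership of fd
        return handle.read()
```

You would open the top-level directory once with `os.open(path, os.O_RDONLY | os.O_DIRECTORY)`, do your logical locking there, and then pass that descriptor to every lookup in the tree.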

In some other cases, I've used a pattern where I used a symlink to folders. The symlink is created, resolved or updated atomically, and all I need is eventual consistency.

That last case was to manage several APT repository indices. The indices were constantly updated to publish new testing or unstable releases of software, and machines in the fleet were regularly fetching the repository index. The APT protocol and structure being a bit "dumb" (for better or worse) requires you to fetch files (many of them) in the reverse order they are created, which leads to obvious issues such as the signature being updated only after the list of files is updated, or the list of files being created only after the list of packages is created.

Long story short, each update would create a new, consistent folder, and a symlink points to the last created folder (to atomically replace the folder, as it was not possible to swap them). A small HTTP server would initiate a server-side session when the first file is fetched and only return files from the same index list. Everything is eventually consistent, and we never get APT complaining about signature or hash mismatches. The pivotal component was indeed the atomicity of the symlink, as the Java implementation didn't have access to a more modern "openat" syscall, relative to a specific folder.
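The symlink swap itself can be sketched like this on a POSIX filesystem. This is not the original Java code, just the general pattern; `publish`, the `current` link name and the `.tmp` suffix are placeholders:

```python
import os


def publish(root: str, new_dir_name: str, link_name: str = "current") -> None:
    """Point root/link_name at new_dir_name without readers seeing a gap.

    The new symlink is created under a temporary name, then renamed over
    the old one. rename(2) replaces the destination atomically, so readers
    resolve either the old target or the new one, never a missing link.
    """
    link_path = os.path.join(root, link_name)
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)              # leftover from an interrupted publish
    os.symlink(new_dir_name, tmp_link)   # relative target, resolved inside root
    os.replace(tmp_link, link_path)      # atomic swap of the link itself
```

Each update writes a complete new folder first and only calls `publish` once it is consistent, so the old folder stays readable for clients that resolved the link earlier.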


As someone who worked on Meet at Google, it seems that it could have been networking to the datacenters where the call is routed from, or some issues with UDP comms on your network which triggered a bad fallback to WebRTC over TCP. Could also have been issues with the browser version you used.

Since Teams is using the very old H264 codec and Meet is using VP8 or VP9 depending on the context, it's possible you also had some other issues with bad decoding (usually done in software, but occasionally by the hardware).

Overall, it shouldn't be representative of the experience on Meet that I've seen, even from all the bug reports I've read.


Google is made of many thousands of individuals. Some experts will be aware of all those, some won't. In my team, many didn't know about those details as they were handled by other build teams for specific products or entire domains at once.

But since each product in some different domains had to actively enable those optimizations for themselves, they were occasionally forgotten, and I found a few in the app I worked for (but not directly on).


ICF seems like a good one to keep in the box of flags people don't know about because like everything in life it's a tradeoff and keeping that one problematic artifact under 2GiB is pretty much the only non-debatable use case for it.

