Hi

On Tue, Jan 23, 2007 at 12:22:26AM +0200, Siarhei Siamashka wrote:
> Hello All,
>
> First some background information and rationale.
>
> Nokia 770 [1] graphics chip has support for packed YUV422 color format
> (IMGFMT_YUY2 according to ffmpeg classification) but does not support
> scaling (except for pixel doubling feature which can scale image exactly
> twice). So fullscreen video playback suffers a severe performance penalty
> if it needs scaling. And I got some information that PXA270 in latest Sharp
> Zaurus PDA [2] also doesn't have hardware scaling capabilities, but does
> support YUV colorspace (which formats exactly are supported still needs to
> be clarified). So developing a fast ARM optimized scaler for these and similar
> devices makes sense.
>
> A natural solution for getting good scaler performance is to use JIT style
> dynamic code generation. I spent two full days on the last weekend and got
> some initial scaler implementation working (it is quite simple and
> straightforward and uses less than 300 lines of code):
> https://garage.maemo.org/plugins/scmsvn/viewcvs.php/trunk/libswscale_nokia770/?root=mplayer
>
> Its API is quite similar to libswscale, but a bit simplified. You need to
> initialize scaler context by providing source and destination resolution,
> and also quality level setting. Code for scaling of a horizontal line of
> pixels is dynamically generated at this stage. Once context is initialized,
> it can be used to scale planar YUV image data and get results in YUY2
> format.
>
> Horizontal scaler works in the following way: each pixel in the destination
> buffer is either a copy of some pixel in the source buffer or an average value
> (1:1 proportion) of two nearest pixels. It is possible to extend scaling
> precision to add averaging proportions 1:3 and 3:1 with almost no
> overhead. Vertical scaling now just maps some source buffer line to each
> destination buffer line, but it can probably be extended to add support for
> 1:1 proportion averaging of two neighbour source pixel lines to get
> destination buffer line.
>
> So depending on quality setting, we get either nearest neighbour scaler or
> some kind of simplified low precision bilinear scaler. In order to estimate
> performance, I did some benchmarks with mplayer_1.0rc1-maemo.8 [3] which
> already has this JIT code in use.
>
> # mplayer -nosound -benchmark -quiet -endpos 100 [scaler_settings] video.avi
>
> *** -sws 4 ***
> SwScaler: Nearest Neighbor / POINT scaler, from yuv420p to yuyv422 using C
> SwScaler: using C scaler for horizontal scaling
> SwScaler: using n-tap C scaler for vertical scaling (BGR)
> SwScaler: 640x272 -> 400x170
>
> BENCHMARKs: VC: 62.645s VO: 58.738s A: 0.000s Sys: 1.053s = 122.435s
> BENCHMARK%: VC: 51.1654% VO: 47.9746% A: 0.0000% Sys: 0.8599% = 100.0000%
>
> *** -sws 1 ***
> SwScaler: BILINEAR scaler, from yuv420p to yuyv422 using C
> SwScaler: using C scaler for horizontal scaling
> SwScaler: using n-tap C scaler for vertical scaling (BGR)
> SwScaler: 640x272 -> 400x170
>
> BENCHMARKs: VC: 64.029s VO: 164.350s A: 0.000s Sys: 1.321s = 229.700s
> BENCHMARK%: VC: 27.8750% VO: 71.5500% A: 0.0000% Sys: 0.5750% = 100.0000%
>
> *** JIT scaler, quality = 1 (nearest neighbour) ***
> [nokia770] Using ARM JIT scaler (quality=1) to scale 640x272 => 400x170
>
> BENCHMARKs: VC: 63.033s VO: 5.585s A: 0.000s Sys: 0.940s = 69.559s
> BENCHMARK%: VC: 90.6193% VO: 8.0295% A: 0.0000% Sys: 1.3512% = 100.0000%
>
> *** JIT scaler, quality = 2 (use pixel copy or 1:1 proportion averaging for
> horizontal scaling, nearest neighbour for vertical scaling) ***
> [nokia770] Using ARM JIT scaler (quality=2) to scale 640x272 => 400x170
>
> BENCHMARKs: VC: 62.893s VO: 7.551s A: 0.000s Sys: 1.000s = 71.444s
> BENCHMARK%: VC: 88.0310% VO: 10.5686% A: 0.0000% Sys: 1.4004% = 100.0000%
>
> So performance improvement over standard libswscale scalers (first two runs)
> is really huge. JIT scaler with quality setting 1 and nearest neighbour scaler
> from libswscale are direct competitors here and JIT scaler implementation is
> 10x faster :)
>
> Using JIT scaler quality 2 settings, I can see some 'sparkles' in the image on
> vertical panning scenes, but horizontal panning looks ok. So I expect a
> good quality after improving vertical scaling by adding lines averaging.
>
>
> Now I wonder if it would be a good idea to include this JIT scaler for ARM
> into ffmpeg and what are the requirements for that? Of course I will clean up
> this code first, add more sanity checks and comments (most likely on next
> weekend). But I'm more worried about integration into libswscale code without
> turning it into a mess.

No, there won't be any mess. See swscale.c around line 2047: there are
plenty of special-case converters, just add yours there too (I am assuming
that your converter does YV12 -> YUV422 + vertical + horizontal scaling).

If you want your converter to also be used just as a horizontal scaler
together with the existing code for vertical scaling and colorspace
conversion, then see hyscale & hcscale in swscale_template.c, but please
don't follow the bad example there of just dumping the asm in there ...

> 1. Is there any documentation about internal libswscale structure and
> some hacking guidelines?

No, but such docs would be welcome as a patch :)

> 2. I see that scalers from libswscale have to support slices. Is it the only
> extra requirement or I should be aware of something else?

Negative stride maybe, but that shouldn't be a problem I guess.

> 3. What would be the best mapping of the scaling methods used in this JIT
> scaler code to libswscale scaling algorithms (nearest neighbour is clear, but
> I'm not sure about the rest).

SWS_FAST_BILINEAR for inaccurate bilinear; it's currently used for our x86
JIT scaler, so this seems like a good choice.
SWS_BILINEAR for (completely) correct bilinear scaling,
and we can add more SWS_* if needed ...

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Everything should be made as simple as possible, but not simpler.
-- Albert Einstein
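
Editorial note on the horizontal scaling rule described in the quoted message (each destination pixel is either a copy of a source pixel or a 1:1 average of two neighbours). What follows is a minimal plain C sketch of that rule, not the JIT code from the linked repository and not libswscale code. The function name hscale_line_sketch, the 16.16 fixed-point stepping, and the thresholds that decide between "copy" and "average" are all assumptions made for illustration; the real scaler generates ARM code per context and may choose pixels differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical reference routine (assumption, not from the JIT sources).
 * Scales one line of 8-bit samples from src_w to dst_w pixels.
 *   quality 1: nearest neighbour (copy the closer source pixel)
 *   quality 2: copy a source pixel, or take a 1:1 average of two
 *              neighbours when the sample position falls near the
 *              midpoint between them
 */
void hscale_line_sketch(uint8_t *dst, int dst_w,
                        const uint8_t *src, int src_w, int quality)
{
    assert(dst_w > 0 && src_w > 0 && src_w < 65536);

    /* Source position advance per destination pixel, 16.16 fixed point. */
    uint32_t step = ((uint32_t)src_w << 16) / (uint32_t)dst_w;
    uint32_t pos = 0;

    for (int x = 0; x < dst_w; x++, pos += step) {
        int i = (int)(pos >> 16);                /* left neighbour index   */
        uint32_t frac = pos & 0xFFFFu;           /* distance past src[i]   */
        int next = (i + 1 < src_w) ? i + 1 : i;  /* clamp at the right edge */

        if (quality >= 2 && frac >= 0x4000u && frac < 0xC000u) {
            /* Near the midpoint: 1:1 average with rounding. */
            dst[x] = (uint8_t)((src[i] + src[next] + 1) >> 1);
        } else if (frac >= (quality >= 2 ? 0xC000u : 0x8000u)) {
            dst[x] = src[next];                  /* closer to the right pixel */
        } else {
            dst[x] = src[i];                     /* closer to the left pixel  */
        }
    }
}
```

In a JIT version the copy/average decision for every destination pixel would be made once at context initialization and baked into the generated code, so the per-line loop contains no branches on frac; the loop above only shows which pixel values such code would produce under the stated assumptions.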