An anti-aliasing algorithm applied to horizontal lines

Excerpted from: http://sourceforge.net/p/vector-agg/mailman/vector-agg-general/?viewmonth=200308

First of all, let me thank you for the perfectly
explained suggestions.

The other thing is that the "rectangle" optimization
doesn't make much sense in comparison with the "hline"
one. It complicates things a lot for very little gain.
In fact, the low-level rectangle() uses hline().

I agree with you completely that solid polygons can be
optimized and it can be done in a way you described. 

But first let me tell you about the rasterization
algorithm. Honestly, I don't completely understand the
math (the calculating part, outline_aa::render_scanline)
and I guess we'll have to ask David Turner to explain
it if we need to.

The algorithm consists of three phases - decomposing
the path into square cells (pixels), sorting the
cells, and "sweeping" the scanlines.

When you call rasterizer.move_to() it accumulates
cells (pixels) that are crossed by the path. It also
calculates the coverage value for each cell (the area
of the cell that the polygon covers). After it's done,
all the cells are sorted by Y and then by X with a
modified quick-sort (actually, I sort pointers to the
cells). Finally, the render() method "sweeps" the sorted
cells and converts them into a number of scanlines. It
sums all the cells with the same coordinates in order
to calculate the right cover value (this happens when
a pixel is crossed more than once by different edges of
the path). This guarantees correct cover values even if
a complex polygon is so small that it falls into one
pixel. This is the main advantage of the algorithm
because it allows you to render very thin objects
correctly (simulating very thin lines with proper
fading).
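
To make the sort and sweep phases a bit more concrete, here is a
minimal C++ sketch. It is not AGG's actual code: Cell, cell_less,
sort_cells and merge_cells are hypothetical names, and each cell
carries a single signed cover contribution, which is a
simplification assumed for the illustration. It only shows the two
steps named above: order the cells by Y and then by X, then sum the
cells that share the same coordinates.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical cell record: one entry per pixel touched by a path edge.
struct Cell {
    int x;
    int y;
    int cover;  // signed contribution of one edge crossing (simplified)
};

// Ordering used by the sort phase: by Y first, then by X.
inline bool cell_less(const Cell& a, const Cell& b) {
    return a.y != b.y ? a.y < b.y : a.x < b.x;
}

// Sort phase. std::sort stands in for the modified quick-sort.
void sort_cells(std::vector<Cell>& cells) {
    std::sort(cells.begin(), cells.end(), cell_less);
}

// Part of the sweep phase: cells with identical coordinates are adjacent
// after sorting, so their contributions can be summed in one pass.
std::vector<Cell> merge_cells(const std::vector<Cell>& sorted) {
    assert(std::is_sorted(sorted.begin(), sorted.end(), cell_less));
    std::vector<Cell> merged;
    for (const Cell& c : sorted) {
        if (!merged.empty() && merged.back().x == c.x && merged.back().y == c.y) {
            merged.back().cover += c.cover;  // same pixel crossed again
        } else {
            merged.push_back(c);
        }
    }
    return merged;
}
```

After merging, every pixel appears at most once per scanline, which
is why a shape that falls entirely inside one pixel still ends up
with a single, correctly summed value.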

Function render() creates a scanline that consists of
a number of horizontal spans and then calls
renderer::render() that actually draws the scanline. 

I see two major possibilities to optimize rendering.

1. Using agg::scanline_p32 instead of
agg::scanline_u8. Here 'p' and 'u' refer to 'packed'
and 'unpacked'. 'Packed' means that each run of cells in
the scanline that share one cover value is represented
as a horizontal line with x, y, length, and cover.
'Unpacked' means that every cell in the scanline has
its own cover value even if they are all the same. So,
the straightforward way is to use agg::scanline_p32 and
to optimize renderer_scanline::render(). There are two
notes. First, it makes sense only if the area of the
objects is large enough; that is, rendering small
glyphs is more efficient with agg::scanline_u8
because its code has fewer branches. Second, it works
only for solid color filling. For gradients, images, and
Gouraud shading we'll have to process the scanline
pixel-by-pixel anyway. From this point of view using
agg::scanline_p32 doesn't make much sense either.
Still, solid filling is a very common operation and
it'd be very good to optimize it. We'll have the best
result when the color is opaque. But we can optimize
translucent colors too! Here we can use the fact that
the background image is relatively coherent. For a
solid span we calculate (alpha-blend) the first pixel
and then check the next one: if the background color of
the next pixel is the same, we simply write the
previously calculated value.
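
As a rough illustration of both ideas, here is a small C++ sketch.
Everything in it is an assumption made for the example and not the
agg::scanline_p32 interface: Span, pack_covers, blend and fill_span
are hypothetical names, and the row is a single 8-bit gray channel
to keep it short. pack_covers turns per-pixel ("unpacked") covers
into runs, and fill_span blends a translucent solid color while
reusing the last result whenever the background pixel repeats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed span: a run of pixels sharing one cover value.
struct Span {
    int x;
    int len;
    std::uint8_t cover;
};

// Pack per-pixel covers (starting at x0) into runs of equal cover.
// Pixels with zero cover are skipped and break a run.
std::vector<Span> pack_covers(int x0, const std::vector<std::uint8_t>& covers) {
    std::vector<Span> spans;
    for (std::size_t i = 0; i < covers.size(); ++i) {
        if (covers[i] == 0) continue;
        const int x = x0 + static_cast<int>(i);
        if (!spans.empty() && spans.back().cover == covers[i] &&
            spans.back().x + spans.back().len == x) {
            ++spans.back().len;  // extend the current run
        } else {
            spans.push_back({x, 1, covers[i]});
        }
    }
    return spans;
}

// Blend one 8-bit channel: dst + (src - dst) * alpha / 255.
inline std::uint8_t blend(std::uint8_t dst, std::uint8_t src, std::uint8_t alpha) {
    return static_cast<std::uint8_t>(dst + ((src - dst) * alpha) / 255);
}

// Fill one solid span on a gray row. The blend is recomputed only when
// the background value changes. Returns the number of blends computed.
int fill_span(std::vector<std::uint8_t>& row, const Span& s,
              std::uint8_t color, std::uint8_t alpha) {
    assert(s.x >= 0 && s.x + s.len <= static_cast<int>(row.size()));
    // Effective alpha is the color alpha scaled by the span's cover.
    const std::uint8_t a = static_cast<std::uint8_t>((alpha * s.cover) / 255);
    int blends = 0;
    bool have_cached = false;
    std::uint8_t cached_dst = 0, cached_out = 0;
    for (int i = 0; i < s.len; ++i) {
        std::uint8_t& px = row[s.x + i];
        if (!have_cached || px != cached_dst) {
            cached_dst = px;
            cached_out = blend(px, color, a);
            have_cached = true;
            ++blends;
        }
        px = cached_out;
    }
    return blends;
}
```

On a flat background the whole span costs one blend plus a compare
per pixel; on a noisy background it degrades to a blend per pixel,
so the gain depends on how coherent the background really is.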

It's all very good, but it doesn't help much to speed
up drawing thin strokes. In this case the distribution
of time spent is quite different. The most
time-consuming operation here is the quick-sort. You can
ultra-optimize the scanline rendering but you won't
achieve much, because the scanlines in this case are
very short and they don't play any considerable role.

There's a second possibility for optimization.

2. We cannot get rid of rasterizing and calculating
cells (outline_aa::render_line) but we can play with
the quick-sort. Namely, we don't have to sort the whole
path at once, only each scanline. If the sorting
algorithm had exactly linear complexity this wouldn't
make sense. But it's faster to sort 100 arrays of 10
elements each than one array of 1000 elements. When
rasterizing a thin line we usually have 2, 3, or 4
cells in the scanline. A simple insertion sort works
faster than the quick-sort in these cases. It looks
rather promising but it requires a kind of smart
memory management (it requires reallocs and I wouldn't
rely on them being fast and painless).
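
A minimal C++ sketch of the per-scanline idea, under stated
assumptions: Cell, insertion_sort_by_x and sort_per_scanline are
hypothetical names, not AGG functions, and the Y range of the path
is assumed to be known in advance. Cells are first dropped into one
bucket per scanline, then each small bucket is sorted by X with an
insertion sort.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cell record, reduced to what the sort needs.
struct Cell {
    int x;
    int y;
    int cover;
};

// Insertion sort by X. Cheap for the 2-4 cells a thin line usually
// leaves on a scanline.
void insertion_sort_by_x(std::vector<Cell>& row) {
    for (std::size_t i = 1; i < row.size(); ++i) {
        const Cell key = row[i];
        std::size_t j = i;
        while (j > 0 && row[j - 1].x > key.x) {
            row[j] = row[j - 1];  // shift larger elements right
            --j;
        }
        row[j] = key;
    }
}

// Distribute cells into one bucket per scanline (this already orders
// them by Y), then sort each bucket by X independently.
std::vector<std::vector<Cell>> sort_per_scanline(const std::vector<Cell>& cells,
                                                 int min_y, int max_y) {
    assert(min_y <= max_y);
    std::vector<std::vector<Cell>> rows(static_cast<std::size_t>(max_y - min_y + 1));
    for (const Cell& c : cells) {
        assert(c.y >= min_y && c.y <= max_y);
        rows[static_cast<std::size_t>(c.y - min_y)].push_back(c);
    }
    for (std::vector<Cell>& row : rows) {
        insertion_sort_by_x(row);
    }
    return rows;
}
```

The per-row std::vector used here grows by reallocating, which is
exactly the memory-management cost the paragraph worries about, so
this should be read as the simplest form of the idea and not as a
tuned layout.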

agg::scanline_p32 is not finished yet. '32' refers to
the maximal capacity of the coverage value - 32 bits,
but I'd add one more template argument in order to use
8-bit values.

McSeem

--- eric jones <[email protected]> wrote:
> Hey Group,
> 
> Thinking out loud and long winded...
> 
> I've just started exploring the process for building
> an optimized renderer for general (anti-aliased, thick,
> etc.) vertical/horizontal lines and rectangles.  Since
> I haven't poked around much in this part of agg, it is
> all high-level.  Forgive me if this all falls under the
> category of "obvious," but I have not done much with
> low-level graphics algorithms before.  I am just trying
> to figure out what the important abstractions are and
> hopefully get some ideas about where to plug such
> ideas into agg.

> The easiest place to start looking is at drawing a
> single, solid ( i.e:
> non-dashed but potentially semi-transparent),
> vertical/horizontal line
> with arbitrary thickness and end-caps.  The line can
> be broken into
> three basic regions - the two end regions and the
> middle region.  For a
> horizontal line, the regions can be labeled as so:
> 
> 		End1   Middle   End2
> 
> Let's look at the middle region of pixels first
> because it is the simplest to render.  For example, if
> our horizontal line is 5 pixels (scanlines) thick, the
> "cover" value for all the pixels on a single scanline
> will be the same because the antialiasing algorithm
> would return the same alpha value for all of these
> pixels (I hope I am using the cover term correctly
> here).  For example, the 5 scanlines of the middle
> region might have alpha values (assume 0-9 as the full
> alpha range) of 1, 5, 9, 5, and 1 respectively,
> resulting in the following alpha mask for the middle
> region, applied to the line's color value.
> 
>             111111111111111111111111111111111111
>             555555555555555555555555555555555555
>             999999999999999999999999999999999999
>             555555555555555555555555555555555555
>             111111111111111111111111111111111111
>

> Based on this, we only have to calculate alpha once
> for the line and then call the hline() method (with the
> alpha blending patch) 5 times - once for each scanline.
> I'm guessing this would be a decent speed win over the
> current algorithm.  Is this a correct assumption,
> McSeem?  Also, it makes hline() and vline() great
> candidates for platform-dependent optimization in the
> future (SSE2, the Intel Image library, or whatever)
> because making them fast would speed up a large number
> of the general cases.
> 
> As for the end regions, they need a "complex"
> anti-aliasing algorithm applied to them where the
> "cover" value of each pixel is treated independently.
> This is similar to the current rendering approach, but
> we can't just treat the end-caps as polygons and feed
> them into the current path renderer because
> antialiasing is applied from one side (left on End1,
> right on End2).  McSeem, is this right or is there some
> way for the current path renderer to handle this?
> 
> Here are the alpha values of my (fictitious) width=5
> line with rounded
> end-caps broken out by the region in which they are
> rendered.

> 
>        End1                 Middle                End2
>              111111111111111111111111111111111111
>          123 555555555555555555555555555555555555 321
>         1257 999999999999999999999999999999999999 7521
>          123 555555555555555555555555555555555555 321
>              111111111111111111111111111111111111
> 
> 
> Hmmm.  I guess, with thicker lines, there would
> really be another region
> of interest:
> 
> 			  Top-Middle
> 		End1	Center-Middle     End2
> 			Bottom-Middle
> 
> Here, the ends are rendered the same way as before.
> The Top-Middle and Bottom-Middle regions would be the
> anti-aliasing "blend-in" regions of the line and
> rendered as previously discussed for the "Middle"
> region.  The Center-Middle section would be the area
> of the line that has a constant alpha cover of 9 and
> could be filled with a call to the rectangle()
> primitive.  So, breaking out the Top, Center and
> Bottom regions, assuming a new line of width 10, we
> would have something like:
>

>             111111111111111111111111111111111111  Top-Middle
>             555555555555555555555555555555555555
>
>             999999999999999999999999999999999999
>             999999999999999999999999999999999999
>             999999999999999999999999999999999999  Center-Middle
>             999999999999999999999999999999999999
>             999999999999999999999999999999999999
>             999999999999999999999999999999999999
>
>             555555555555555555555555555555555555  Bottom-Middle
>             111111111111111111111111111111111111
> 
> So, I guess this can all be generalized by saying
> there are three major types of regions for
> anti-aliased rendering of any type of object, be it
> a thick line, a rectangle, or an arbitrary path:
> 
>    1. Quickly varying areas where the alpha is calculated
>       for each pixel.
>    2. Slowly varying areas where alpha is calculated for an
>       entire row (vertical or horizontal) at a time.
>    3. Constant areas where the alpha doesn't change.  This
>       could be, but doesn't have to be, a rectangular region.
>

> It happens that it is fairly simple to break
> horizontal/vertical lines
> and rectangles into these regions.  The vertical
> line is the same as the
> horizontal if we exchange "scan-column" for
> "scanline."  As for a
> rectangle, we have to deal with the joins in the
> corners.  Its regions
> would break down as follows:
> 
>     TL-Corner    Top-Middle     TR-Corner
>     Left-Middle  Center-Middle  Right-Middle
>     BL-Corner    Bottom-Middle  BR-Corner
> 
> Here, the Corners are all "quickly varying," the Top-
> and Bottom-Middle are "slowly varying" using calls to
> the hline() primitive, the Left- and Right-Middle are
> "slowly varying" using calls to the vline() primitive,
> and the Center-Middle is, again, constant.
> 
> I'm most interested in the cases that are described
> above, but it occurs to me that it is possible to
> decompose arbitrary paths into these three types of
> regions prior to calling a renderer.  This
> domain decomposition might be so expensive that it
> swamps any benefits in some cases -- I am not
> experienced with such algorithms.  I would think that
> there is some way to do it, perhaps on a per-scanline
> basis instead of on the entire path, that would
> provide a speed improvement.
> 
> Back to horz/vert lines and rectangles.  I still
> need to handle dashed lines.  I'm guessing the way to
> do this is pass this through

Time: 2024-10-17 22:41:27
