The ChunkyPPC Library
=====================
--- snip ---
Newsflash:
This is a major new version of chunkyppc.library.
Note: The new functions are ONLY supported if you use chunkyppc.library
as a Shared Library. Using it through linkchunkyppc.lib does not support
the new functions !!! I included the old version of linkchunkyppc.lib
though, it just does not cover the new functions of V3+. If you need V3+,
use the Shared Library.
Mainly I included some new functions on request from *Hyperion Software* !!!
Hyperion wanted an OpenScreen/CloseScreen interface which can handle:
- Open on a Screen
- Open on a WB Window
- Open on a PIP Window
- Open on a Pubscreen
- use ASL Requester, not rtgmaster
- rtgmaster requester optionally
- change screenmodes of a running program "on the fly" (without having
to quit the program)
- automatically handle Chunky-Format Conversions
- Support for all 16 Bit Formats, RGB15 and LUT8 needed
- For AGA 8 Bit and HAM8 needed
- LoadRGB32 Feature with included Color-Adaption for WB-Window
- Doublebuffering (both AllocScreenBuffer and ScrollVPort) and
Triple Buffering Support
- Support for Custom Hooks both on 68k and WarpUP
- has to work on both 68k and PowerPC (WarpUP)
Note: AGA Mode claims to be 15 Bit, but is really 16 Bit :) Just always
use PIXFMT_RGB15 for AGA in ChunkyInit :)
Note: Executables compiled for chunkyppc.library V3-V11 (which were internal
releases anyway) don't work anymore on V11. This is V14. If you have a
Beta V3-V9, don't give it to anyone, to avoid confusion. The only versions
which should be given to other people are V14 or the old V2 ones, but
nothing "in-between".
Note: Due to the special implementation of EGCS-WarpUP, ms.algo cannot
be called directly with it. Use the CallChunkyCopy #define from the
include instead. For other compilers you can just
#define CallChunkyCopy ms.algo
Since V3.0 chunkyppc.library supports these features.
There are the following new functions:
OpenGraphics
------------
a0 : Title (of the program)
a1 : Mode_Screen structure
d0 : override flag
This function opens a screenmode-requester and after this a screen/window.
It uses the following env-variables:
/modeid:
If the override flag is *not* set and this ENV variable exists, the modeid
stored in it is used; if it does not exist, a screenmode requester is
opened. If the flag is set, a screenmode requester is always opened. So by
default a saved screenmode is used once the user has saved one, but the
screenmode can still be changed in-game.
Since V5 there is (on request of Hyperion Software) an override mode == 2.
If this is set, a small Screenmode Database is kept inside env: (and
envarc:). OpenGraphics() then checks whether a modeid file with the correct
resolution is already found in the directory. This way it is possible to
maintain several different resolutions in one database, without the
screenmode-requester popping up on every screen-width/height/depth change.
The files are named modeid, modeid1, modeid2, ... Up to 21 entries (modeid,
modeid1, ..., modeid20) are supported; if more are entered, modeid20 is
always the one that gets changed.
/dbuf:
Use Doublebuffering
/oldstyle
If this is set, ScrollVPort will be used, else Triple Buffering using
ScreenBuffers.
/wb
If this is set, a Workbench Window will be used, not a screen
/pip
If this is set, a Picture in Picture will be used using the Picasso96API.library.
Note: Not all Graphics Boards Support this. Currently only P96 PIP is supported,
not CGX. Maybe in the future...
/ham
Use HAM8 mode for AGA. If this is not set, 8 Bit mode is used for AGA.
/pipnoclear
If this is set to 1, the PIP won't be cleared when opened.
/rtgmaster
Use rtgmaster Screenmode Requester instead of ASL one. rtgmaster.library
will only be opened when this option is set, so that rtgmaster.library
does not need to be installed to be able to use chunkyppc.library.
/likecgx
Set this if you want CGX-like WB Window Support for P96 (faster, but the
Window Borders disappear). For CGX it is set automatically, as it is the
only possible method for CGX to get WB Window Support (at least for 16 Bit);
for P96 it is optional.
Hyperion Software will support a GUI to set these ENV-Variables for their
products, as they told me.
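The file naming of the override == 2 database described under /modeid can
be sketched like this. It only illustrates the naming rule (modeid, modeid1,
..., modeid20, with everything past the last slot landing on modeid20).
modeid_slot_name is a hypothetical helper, not a function of
chunkyppc.library, and the directory the files live in is left out.

```c
#include <stdio.h>

#define MODEID_LAST_SLOT 20   /* modeid, modeid1, ..., modeid20 = 21 entries */

/* Hypothetical helper (not part of chunkyppc.library): writes the file name
   of database slot "slot" into buf. Slot 0 is plain "modeid"; slots beyond
   the last one are clamped to modeid20, the entry that gets changed when
   the database is full. */
void modeid_slot_name(int slot, char *buf, size_t buflen)
{
    if (slot < 0)
        slot = 0;
    if (slot > MODEID_LAST_SLOT)
        slot = MODEID_LAST_SLOT;

    if (slot == 0)
        snprintf(buf, buflen, "modeid");
    else
        snprintf(buf, buflen, "modeid%d", slot);
}
```

Asking for slot 25 gives the same name as slot 20, which matches the
"modeid20 is changed always" rule.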
The Mode_Screen structure is a special structure containing information
on the screen, like the Screen structure of Intuition or the RtgScreen
structure of rtgmaster. It does not matter if you got it from OpenGraphics
or constructed it yourself. The Mode_Screen structure also takes the
minimum/maximum Width/Height/Depth, so you should init these values before
you call OpenGraphics.
If the video_screen element of Mode_Screen is !=0 this screen will be closed,
and a new one will be opened ("on-the-fly-screen-change").
The important elements of Mode_Screen (as defined in clib/chunkyppc_protos.h):
video_screen: An Intuition Screen
video_window: An Intuition Window
bpr: Bytes Per Row
mode: The ModeID
SCREENWIDTH: Minimal Width/Actual Width
SCREENHEIGHT: Minimal Height/Actual Height
MAXWIDTH: Maximum Width
MAXHEIGHT: Maximum Height
MINDEPTH: Minimum Depth
MAXDEPTH: Maximum Depth
format: Video Format (using Cybergraphics constants)
video_depth: BitsPerPixel
screen: Buffer 0 Video RAM / Address BitPlane 0
screenb: Buffer 1 Video RAM / Address Bitplane 1
screenc: Buffer 2 Video RAM / Address Bitplane 2
bufnum: Active Buffer number
bitmapa-bitmapc: the Bitmaps
thebitmap: Active Bitmap
numbuffers: Number of Buffers (ScrollVPort only 2, no DBuffering only 1)
algo: Function-pointer to Chunky-Copy Algorithm
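Filling in the limits before the first call to OpenGraphics might look like
the sketch below. The real structure comes from the chunkyppc includes. The
stand-in struct here only mirrors some of the field names listed above, its
field types are assumptions, and the numeric limits are arbitrary example
values.

```c
#include <string.h>

/* Stand-in for illustration only: the field names follow the list above,
   the types are assumed. Use the real Mode_Screen from the includes. */
struct Mode_Screen_Sketch
{
    void *video_screen;             /* != 0 would mean "close this one first" */
    int SCREENWIDTH, SCREENHEIGHT;  /* minimal width/height to ask for */
    int MAXWIDTH, MAXHEIGHT;        /* maximum width/height to accept */
    int MINDEPTH, MAXDEPTH;         /* accepted depth range */
};

/* Hypothetical helper: prepares the structure for a first OpenGraphics call. */
void init_limits(struct Mode_Screen_Sketch *ms)
{
    memset(ms, 0, sizeof *ms);
    ms->video_screen = 0;           /* nothing open yet, so nothing to close */
    ms->SCREENWIDTH  = 320;
    ms->SCREENHEIGHT = 200;
    ms->MAXWIDTH     = 640;
    ms->MAXHEIGHT    = 480;
    ms->MINDEPTH     = 8;
    ms->MAXDEPTH     = 16;
}
```

For an on-the-fly screen change you would leave video_screen as it was
filled in by the previous call instead of clearing it.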
CloseGraphics
-------------
a0: Mode_Screen structure
d0: shutdownlibs Flag
This closes the Screen/Window again (and all other stuff... :) )
If shutdownlibs is set, the libraries used will also be closed.
If not they will not be closed.
LoadColors
----------
This is basically the same as LoadRGB32 of graphics.library or
LoadRGBRtg of rtgmaster. It sets the colors of an 8 Bit Screen. It does
nothing for a 15/16 Bit Screen.
a0: Mode_Screen structure
a1: Color-array like for LoadRGB32/LoadRGBRtg
DoubleBuffer
------------
a0: Mode_Screen structure
Causes Double/Triplebuffering (depending on the values of env:/oldstyle
and env:/dbuf). Note that WB Window and PIP Modes are single buffered;
for single-buffered modes this does not do anything. When the screen has
just been opened, buffer 0 is always active !!! Triplebuffering then
switches buffers in the sequence 0-1-2-0-1-2-0..., Doublebuffering in the
sequence 0-1-0-1-0...
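The buffer sequences can be expressed as a simple modulo step. This is just
a sketch of the order in which buffers become active, assuming bufnum and
numbuffers behave as described in the Mode_Screen list; next_buffer is a
hypothetical helper and not how the library is necessarily implemented.

```c
/* Hypothetical helper: the buffer number that is active after one
   DoubleBuffer call, assuming buffers are cycled in ascending order
   starting from buffer 0. */
int next_buffer(int bufnum, int numbuffers)
{
    if (numbuffers <= 1)
        return bufnum;                  /* single buffered: nothing changes */
    return (bufnum + 1) % numbuffers;   /* 0-1-2-0... or 0-1-0... */
}
```

With numbuffers == 3 this walks 0-1-2-0, with numbuffers == 2 it walks
0-1-0, and with a single buffer it stays where it is.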
ChunkyInit
----------
r4: Mode_Screen structure
r5: srcformat
This initializes ms->algo with a PowerPC ChunkyCopy/c2p including Colorformat
Conversion according to src and destination format (ms->format...).
ChunkyInit68k
-------------
a0: Mode_Screen structure
d0: srcformat
This is the 68k counterpart; it initializes ms->algo with a 68k algorithm.
If ChunkyInit/ChunkyInit68k returns 0, this means the source and the
destination formats are incompatible !!!
The following formats are compatible currently:
- for PowerPC all 8 Bit and 16 Bit formats, and RGB15 15 Bit Format
- on 68k 8 Bit and all 15 Bit and 16 Bit formats, but 15/16 Bit only
if source and destination have identical formats (68k currently
does not support format conversion)
- For AGA: On Workbench Window only 8 Bit, naturally
- For AGA: HAM8 mode only on PowerPC
- On Workbench Window: 15/16 Bit formats only supported for Picasso 96,
as WritePixelArray of CyberGraphX does not support a 15/16 Bit
Input Format. On a Screen it is supported also for CGX though.
If you want this on CGX too -> write the CGX authors about it.
The Chunky Algorithms
---------------------
68k:
a0: Mode_Screen structure
a1: dest
a2: src
d0: srcformat
d1: hook68k
d2: data
PPC:
r4: Mode_Screen structure
r5: dest
r6: src
r7: srcformat
r8: hook68k
r9: data
This performs a fullscreen copy of "src" to "dest", performing all
needed format conversion. srcformat is in CyberGraphics Syntax
(PIXFMT_RGB16 or such). dest is a Video RAM Pointer for a GFX Board (you
have to take care yourself whether that of buffer 0, 1 or 2 has to be
provided). For AGA it is a pointer to the correct Bitmap instead !!!
After the Copy has been done, hook68k is called using a Contextswitch
(or, on 68k, as a plain function call) if hook68k is != 0. If hook68k
is 0 this is not done. data is the parameter (void *) of hook68k, and
the function returns the return value of the hook (or 0, if no Hook
was provided).
Why this ?
There are some things which can't be run on the PPC:
- lowlevel.library
- Keyboard Code
- Sound Code
- Doublebuffering
- LoadColors
Most program authors perform one contextswitch for each of those.
Every contextswitch costs 0.5-0.6 ms. Now it is possible to put
several of them together into one 68k function, and call this
function right after the Chunky-Copy, to reduce the number of
Contextswitches per loop to *1*. This is optional code, you can
still set the Hook-Parameter to 0.
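To put numbers on it: five separate contextswitches per loop (one for each
item in the list above) cost 5 x 0.5-0.6 ms = 2.5-3.0 ms, one combined hook
costs 0.5-0.6 ms. The idea of the combined hook can be sketched like this.
Everything in the block is a stand-in that runs on any C compiler; none of
the names exist in chunkyppc.library, and copy_then_hook only mimics the
documented "call the hook after the copy, return its result or 0" rule.

```c
/* Hypothetical "data" block handed to the hook. */
struct FrameJobs
{
    int keys_polled;
    int sound_fed;
    int buffers_flipped;
};

/* One 68k-side hook that does all per-frame 68k work in a single call. */
int frame_hook(void *data)
{
    struct FrameJobs *jobs = (struct FrameJobs *)data;

    jobs->keys_polled++;       /* would be: keyboard / lowlevel.library */
    jobs->sound_fed++;         /* would be: sound code */
    jobs->buffers_flipped++;   /* would be: DoubleBuffer / LoadColors */
    return 1;
}

/* Stand-in for the copy routine: after the copy, call the hook if one was
   given and pass its return value on, else return 0. */
int copy_then_hook(int (*hook)(void *), void *data)
{
    /* ... the chunky copy itself would happen here ... */
    return hook ? hook(data) : 0;
}
```

Calling copy_then_hook(frame_hook, &jobs) once per loop does all three jobs
with a single switch; passing 0 as the hook skips them and returns 0.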
Yes, I said before that Hooks are not possible for PPC. They
are possible, but only with a MixedBinary like chunkyppc.library.
To generate the 68k functions for the Hook you can either:
- create a 68k Shared Library containing the 68k function
- or use a MixedBinary
I don't know which one Hyperion will be using. But generally I advise
people to use a library to avoid MixedBinaries.
There is an example about the MixedBinary feature consisting of the
two files test_new.c and test_new_68k.c. I fear, though, that it will
only compile with StormC, as it uses the "Automatic Contextswitch"
feature of StormC. You could also create the "myhook" function inside
a Custom 68k Shared library of course.
--- snap ---
And now follows the documentation about the features of the "old"
version V2 which of course still work :)
Well, why another Chunky-Library ? We have rtgmaster, after all...
Well, rtgmaster has one BIG BIG function for chunky-copy. Sometimes if
you know a certain special-case exists, you can specifically optimize
for this special-case and get some extra speed out of it. The disadvantage
is of course that you now have TONS of different chunky-copy-functions,
and get confused about which of them is which :)
Well, anyways: the chunkyppc.library works with rtgmaster (just use
GetBufAdr(RtgScreen,0)
as Screenaddress), it works with CGX, it works with Picasso96. It works
with AGA. It works with ECS.
It also contains all my experience in the field of GFX Board Coding.
chunkyppc.library exists in two forms:
chunkyppc.library is a PPC Shared Library for WarpUP (for newer versions of
StormC and for vbcc-WarpUP the linker-lib chunkyppc.lib is provided, for
older versions of StormC and for EGCS-WarpUP Includes are provided, which
serve the same purpose).
linkchunkyppc.lib is a Linker Lib without Shared Library which serves the same
purpose, and supports the same functions. It cannot be used with EGCS WarpUP,
though, as EGCS WarpUP does not support vbcc-WarpUP-Style linkerlibs, like the
other two compilers... for EGCS WarpUP you have to use the PPC Shared Library.
The functions:
ChunkyNoffFast
ChunkyNoffFastest
ChunkyNoffNormal
Chunky-Copy with no destination offset and no source offset. Useful for
Fullscreen-Copy. Fastest assumes the width divisible by 64, Fast divisible
by 8, and Normal does not assume anything. Of course the more limited the faster...
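Picking the variant from the width can be sketched as below. The enum and
pick_noff_variant are hypothetical names for illustration; the real calls
are ChunkyNoffFastest, ChunkyNoffFast and ChunkyNoffNormal, whose argument
lists are in the fd-file and includes.

```c
/* Hypothetical tags for the three variants described above. */
enum NoffVariant { NOFF_NORMAL, NOFF_FAST, NOFF_FASTEST };

/* Returns the most restricted (and therefore fastest) variant whose
   width requirement is met. */
enum NoffVariant pick_noff_variant(int width)
{
    if (width % 64 == 0)
        return NOFF_FASTEST;   /* width divisible by 64 */
    if (width % 8 == 0)
        return NOFF_FAST;      /* width divisible by 8 */
    return NOFF_NORMAL;        /* no assumption about the width */
}
```

A width of 320 or 640 qualifies for Fastest, 360 only for Fast, and an odd
width like 321 needs Normal.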
ChunkyFast
ChunkyFastest
ChunkyNormal
The same as above, only now with a Destination Offset added.
ChunkyFastFull
ChunkyFastestFull
ChunkyNormalFull
Now both sorts of Offset (Source+Destination) are supported. Note that because
some compilers - EGCS WarpUP in particular - don't like many parameters for PPC
Shared Libs, I stuffed some parameters together into a
struct Soff
{
int x,y;
};
for this ... some of the future functions use this too... should be easy to
understand :)
All Chunky-Copy functions are 100% highly optimized PPC ASM. I *really* spent
time to optimize them best. Some of the c2p are only C (c2p_1 and c2p_2 are
PPC ASM). I guess c2p_1 is the fastest, then c2p_2, then c2p_3, then c2p_4,
but never really tested it...
Names in the fd-file:
address = Destination Address
data = Source Address
w = Width of the Blit
h = Height of the Blit
bpr = Destination BytesPerRow
x = Destination X Offset
y = Destination Y Offset
soff = Source Offset (struct Soff *)
sbpr = Source BytesPerRow
buffer = Destination Address
width = Width of the Blit
height = Height of the Blit
bm = struct BitMap * of the Screen
dest = Destination Offset (struct Soff *)
size = Width/Height of the Blit (struct Soff *)
plane0 = bm->Planes[0]
bitplanes = bm->Planes[0]
chunkyx = Width of the Blit
chunkyy = Height of the Blit
bitplanesize = (width*height)/8
depth = 6 or 8
helpbfr = width*height sized FastRAM Buffer
temp = width*height sized FastRAM Buffer
screensize = Size of the Screen (struct Soff *)
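The size-related names can be worked out like this. The sketch assumes one
byte per pixel for helpbfr and temp, which is what "width*height sized"
suggests; both helper functions are hypothetical and not part of the
library.

```c
#include <stddef.h>

/* bitplanesize as given in the list above: (width*height)/8 bytes,
   i.e. one bit per pixel for a single bitplane. */
size_t bitplane_size(size_t width, size_t height)
{
    return (width * height) / 8;
}

/* Size in bytes of a helpbfr/temp buffer, assuming one byte per pixel. */
size_t chunky_buffer_size(size_t width, size_t height)
{
    return width * height;
}
```

For a 320x256 screen this gives a bitplanesize of 10240 bytes and a
81920 byte FastRAM buffer.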
In the future there will be 15-24 Bit support, even with Byteswapping,
HAM8 Support and Masking Support.
There won't be a 68k Version. It would be too much work for my lazy self
to transform all that PPC ASM Code to 68k :)
Note: In the meanwhile there are also functions for:
16 BIT
16 BIT with Byteswapper
16 Bit with RGB<->BGR change
24 Bit
32 Bit
16 Bit with Swapper AND rotater still has to be done, as do quite a lot of
the possible 24/32 Bit changers. Also 16-24 Bit are currently only available
in the "normal" versions. ROT(RGB<->BGR) is completely untested.
For 15 Bit the 16 Bit functions can be used.
Who writes docs for me ? :)
Look at the includes for the new functions for 16-24 Bit :)
Steffen Haeuser