Status
PSPGL is very incomplete. I've tried to make what has been implemented work correctly in a useful way, but there are likely lots of bugs. This is a discussion of what does work (or at least, how it's supposed to work). The TODO section below talks about what I'm planning.
Please report any bugs you find to me, preferably with some sample code which demonstrates the problem. A register state dump is also very useful (enable these by setting the #if 0 to #if 1 around line 42 of pspgl_misc.h and recompiling PSPGL+your program).
A quick list of PSPGL features:
- texture format conversion: PSPGL accepts a wide range of common texture formats, and will convert them into hardware format.
- compressed/paletted textures
- vertex array format conversion: You can use the normal gl*Pointer calls to configure arrays in any way you like (though using native hardware layout will be most efficient).
- compiled vertex arrays: The CVA extension allows vertex array copying/conversion to be cached for multiple uses of an array.
- vertex buffer objects: VBOs allow vertex data to be used directly without copying or conversion.
- EGL config selection works: You can choose 16 or 32 bit framebuffers in a number of configurations, and decide whether or not to have a depth buffer.
- extensions for PSP hardware features: GL extensions allow PSP hardware features to be exposed in an efficient way without breaking programs which don't know/care about them.
- automatic mipmap generation in hardware
Reading back state is not supported. Some state is readable, but a lot is not. Use the glGetIntegerv/glGetFloatv call appropriate to the type of data you want to fetch; they do not return the same set of state.
Complex conversions are not supported. PSPGL will perform enough data conversion to let you get away with not knowing the precise hardware format, but it won't do anything heroic. Things like automatic compression are right out.
Display lists are not supported. There is some non-functioning vestigial support. I'm still unsure whether they can be made to work in a sufficiently efficient and lightweight way.
PSPGL supports specifying geometry in immediate mode within a glBegin/glEnd pair. Immediate mode is useful for writing programs quickly, debugging, instrumentation, etc, but it is never going to be a high-performance path.
The following primitives are supported:
- GL_POINTS: only 1 pixel points
- GL_LINES: only 1 pixel lines
- GL_LINE_STRIP: OK
- GL_LINE_LOOP: behaves the same as a LINE_STRIP (the closing edge isn't drawn)
- GL_TRIANGLES: OK
- GL_TRIANGLE_STRIP: OK
- GL_TRIANGLE_FAN: OK
- GL_QUADS: not supported
- GL_QUAD_STRIP: not supported
- GL_POLYGON: not supported
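Since GL_QUADS is not supported, quad geometry has to be submitted as triangles. The following is a minimal sketch of one way to do that, assuming the quad vertices are stored consecutively, four per quad; the function name `quads_to_triangle_indices` is made up for this example and is not part of PSPGL.

```c
#include <assert.h>
#include <stddef.h>

/* Write triangle-list indices for quad_count quads whose vertices are stored
 * consecutively, four per quad (v0 v1 v2 v3). Each quad becomes the two
 * triangles (v0 v1 v2) and (v0 v2 v3), which keeps the original winding.
 * out must have room for quad_count * 6 entries.
 * Returns the number of indices written. */
size_t quads_to_triangle_indices(size_t quad_count, unsigned short *out)
{
    size_t n = 0;

    /* 16-bit indices can only address vertices 0..65535. */
    assert(quad_count <= 65536 / 4);

    for (size_t q = 0; q < quad_count; q++) {
        unsigned short base = (unsigned short)(q * 4);
        out[n++] = base;
        out[n++] = (unsigned short)(base + 1);
        out[n++] = (unsigned short)(base + 2);
        out[n++] = base;
        out[n++] = (unsigned short)(base + 2);
        out[n++] = (unsigned short)(base + 3);
    }
    return n;
}
```

The resulting indices can then be drawn as GL_TRIANGLES with glDrawElements and GL_UNSIGNED_SHORT. In the same spirit, a closed GL_LINE_LOOP can be approximated with a GL_LINE_STRIP that repeats the first vertex at the end.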
Vertex arrays are the preferred interface for submitting geometry information. Directly using OpenGL 1.1 vertex arrays will have a performance advantage over immediate mode simply because there are fewer function calls, but there are higher performance techniques.
glArrayElement offers little advantage over the normal immediate mode calls; use glDrawArrays, or glDrawElements.
Use the smallest vertex types possible: use GL_UNSIGNED_BYTE rather than FLOAT for colours; use SHORTs rather than FLOATs for vertex and texcoords. Don't enable unused arrays (for example, don't enable a normal array unless lighting is enabled; there are exceptions though). Use BYTE or SHORT indices rather than INT. Use the native vertex format (described below) where possible.
Use strips and fans where possible, though indexed independent triangles are still pretty efficient.
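As an illustration of the "smallest vertex types" advice, here is a small sketch that converts float colours in the range 0..1 into four unsigned bytes, suitable for a GL_UNSIGNED_BYTE colour array. The helper names are hypothetical; the rounding and clamping policy is an assumption, not something PSPGL prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one float channel in [0,1] to an 8-bit value, clamping
 * out-of-range input (NaN is treated as 0). */
static uint8_t unorm8(float c)
{
    if (!(c > 0.0f))
        return 0;
    if (c >= 1.0f)
        return 255;
    return (uint8_t)(c * 255.0f + 0.5f);
}

/* Pack an RGBA float colour (16 bytes) into 4 bytes. */
void pack_color_rgba8(const float rgba[4], uint8_t out[4])
{
    assert(rgba != NULL && out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = unorm8(rgba[i]);
}
```

Each colour shrinks from 16 bytes to 4, which reduces both the memory footprint and the amount of data moved per vertex.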
Note: there's no need to do any explicit cache flushes or use uncached pointers when passing vertex pointers into PSPGL. PSPGL will do all the appropriate cache management for itself.
PSPGL supports the CVA extension. This allows you to set up the vertex arrays, and then call glLockArraysEXT. This will cause PSPGL to convert the vertex data into hardware form, and cache it in the PSP GE's local memory. You may then use the vertex data with multiple calls to glDrawArrays/glDraw(Range)Elements.
There's probably no advantage in using this unless you use the array multiple times, and don't lock too many unused vertices (i.e., only lock the part of the arrays you're actually using). You must unlock the arrays with glUnlockArraysEXT before updating the array pointers with gl*Pointer; if you don't, the pointer update will be effectively ignored. Similarly, changing the contents of a locked array will have no effect.
glLockArraysEXT will only lock up to 128k of vertex data. If you try to lock more, the call will have no effect. Try to keep your vertex arrays under this limit, or use a vertex buffer object in native form.
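Because an oversized lock silently has no effect, it may be worth checking the size before calling glLockArraysEXT. The sketch below assumes that "128k" means 128 * 1024 bytes, that the limit is inclusive, and that you know how many bytes each vertex occupies once converted to hardware form; `cva_lock_fits` and `CVA_LOCK_LIMIT_BYTES` are hypothetical names, not PSPGL identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the documented "128k" limit is 128 KiB, inclusive. */
#define CVA_LOCK_LIMIT_BYTES ((size_t)128 * 1024)

/* Returns 1 if vertex_count vertices of bytes_per_vertex bytes each fit
 * under the lock limit, 0 otherwise. The division avoids overflow in
 * vertex_count * bytes_per_vertex. */
int cva_lock_fits(size_t vertex_count, size_t bytes_per_vertex)
{
    assert(bytes_per_vertex > 0);
    return vertex_count <= CVA_LOCK_LIMIT_BYTES / bytes_per_vertex;
}
```

With a 32-byte vertex, for example, this allows at most 4096 vertices per lock under the stated assumptions.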
(Note: you'll need to carefully read the specification of the ARB_vertex_buffer_object extension [see references] to understand the discussion below. You can safely ignore all this though.)
PSPGL implements the VBO extension (now part of OpenGL 1.5). Buffer objects are a powerful general way for an application to have more control over how OpenGL allocates and uses memory, and are used for more than just vertex arrays, but vertex buffer objects are particularly useful.
Buffer objects are primarily useful when you arrange your data into a form which is directly usable by the hardware without any conversion. If you do this, then a buffer object allows the hardware to directly DMA the data out of the buffer without conversion or copying. Unfortunately the PSP is a bit more rigid than other 3D hardware about what vertex and texture formats and arrangements in memory it will accept, so this takes some care.
PSPGL currently ignores the "usage" parameter of glBufferDataARB, but the intention is that it will be used to decide whether data will be placed in system memory or EDRAM (though ultimately buffers will tend to migrate between the two memory pools).
Because the PSP uses a MIPS CPU without coherent caches, the caches must be managed in software. PSPGL will do this for you, but only if you use the API correctly. This means that you have to be careful to use the glMapBufferARB/glUnmapBufferARB functions properly. PSPGL tries hard to raise GL errors if you use a mapped buffer as an argument to a GL call, so be sure to check for errors when debugging VBO code. NEVER use a buffer pointer after you've unmapped it.
The "access" parameter of glMapBufferARB is used to determine how the cache is treated for the new memory:
Mapping access | Mapping type | Map action | Unmap action | Notes |
GL_WRITE_ONLY_ARB | Uncached | sync with hardware if busy | - | The mapping is uncached to help prevent cache pollution; reads will work from a write-only mapping, but they'll probably be very slow. If you're replacing the entire contents of the buffer while the hardware is potentially using it, it is more efficient to use glBufferDataARB to replace the buffer with a new one, because this doesn't require waiting on the hardware. |
GL_READ_ONLY_ARB | Cached | - | cache is invalidated but not flushed | This will still be writable, but writing to it may cause very strange, non-deterministic results; the cache lines may not be flushed to memory for the hardware to see, or they may be discarded without ever being written (giving the appearance of data which "sticks" for a while, but then reverts to its old value). |
GL_READ_WRITE_ARB | Cached | sync with hardware if busy | dirty lines are flushed, and the cache is invalidated | Safe for all usage, but not as efficient. Use only if you really need to have read-write access to the memory. |
In general, the assumption is that buffer objects are intended for buffers shared with hardware. They are kept in CPU cache only while mapped for access by the CPU, and are otherwise evicted from the cache. This leaves the CPU cache free for other data, and makes sure the hardware always sees a consistent view of the memory. If the arrays you put in buffer objects are not in a form which is directly usable by the hardware, you end up using buffer objects like an inefficient form of malloc with bad cache characteristics.
Note that it is always an error to map a buffer while it is still in use; PSPGL enforces this by raising an error if you try to use a buffer while it's mapped, and waiting for the hardware to finish if you create a writable mapping.
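The access rules above boil down to picking the narrowest mode that matches what the CPU will actually do with the mapping. A minimal sketch of that choice follows; `choose_map_access` is a hypothetical helper, and the fallback constant values are taken from the ARB_vertex_buffer_object specification for the case where no GL header has been included.

```c
#include <assert.h>

/* Normally provided by the GL headers; values from the
 * ARB_vertex_buffer_object specification. */
#ifndef GL_READ_ONLY_ARB
#define GL_READ_ONLY_ARB  0x88B8
#define GL_WRITE_ONLY_ARB 0x88B9
#define GL_READ_WRITE_ARB 0x88BA
#endif

/* Pick the cheapest access mode for a mapping:
 *   writes only -> GL_WRITE_ONLY_ARB (uncached mapping)
 *   reads only  -> GL_READ_ONLY_ARB  (cached, never flushed)
 *   both        -> GL_READ_WRITE_ARB (cached, flushed on unmap)
 *
 * Intended use (not compiled here):
 *   void *p = glMapBufferARB(GL_ARRAY_BUFFER_ARB, choose_map_access(0, 1));
 *   ... fill p ...
 *   glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);  -- p is invalid from here on
 */
unsigned int choose_map_access(int will_read, int will_write)
{
    assert(will_read || will_write);
    if (will_read && will_write)
        return GL_READ_WRITE_ARB;
    return will_write ? GL_WRITE_ONLY_ARB : GL_READ_ONLY_ARB;
}
```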
To arrange a vertex array in native vertex format, you must specify your arrays in the following order in memory (leaving out any array you're not enabling), with sizes and types as follows:
Array | Types | Size |
GL_TEXTURE_COORD_ARRAY | GL_BYTE, GL_SHORT, GL_FLOAT | 2 |
GL_WEIGHT_ARRAY_PSP | GL_BYTE, GL_SHORT, GL_FLOAT | 1-8 |
GL_COLOR_ARRAY | GL_UNSIGNED_BYTE | 4 |
GL_NORMAL_ARRAY | GL_BYTE, GL_SHORT, GL_FLOAT | 3 |
GL_VERTEX_ARRAY | GL_BYTE, GL_SHORT, GL_FLOAT | 3 |
See the Wiki for more details.
Note: Your array of vertices must be packed, so there is no non-vertex data between each vertex. Also, you must enable all these arrays with glEnableClientState for them to be considered as part of the "native format" check. For example, if you sometimes need normals and sometimes not, then it is probably better to always keep the normal array enabled; if you disable them when you disable lighting, then PSPGL needs to copy and re-pack your arrays when you use them, which is likely more expensive than simply letting the hardware ignore an unused normal element in each vertex.
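One possible packed layout that follows the ordering in the table (texture coordinates, colour, normal, position, with no weights) could look like the sketch below. The struct name `NativeVertex` and the choice of float components are assumptions made for illustration; the compile-time checks only confirm that the C compiler inserts no padding, they do not prove that PSPGL treats the layout as native.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleaved vertex in the documented array order. */
typedef struct {
    float   u, v;        /* texture coordinates: 2 x GL_FLOAT        */
    uint8_t rgba[4];     /* colour:              4 x GL_UNSIGNED_BYTE */
    float   nx, ny, nz;  /* normal:              3 x GL_FLOAT        */
    float   x, y, z;     /* position:            3 x GL_FLOAT        */
} NativeVertex;

/* The vertices must be packed, so check that there is no hidden padding. */
static_assert(sizeof(NativeVertex) == 36, "NativeVertex must be tightly packed");
static_assert(offsetof(NativeVertex, rgba) == 8, "colour follows texcoords");
static_assert(offsetof(NativeVertex, nx) == 12, "normal follows colour");
static_assert(offsetof(NativeVertex, x) == 24, "position follows normal");
```

Each gl*Pointer call would then use `sizeof(NativeVertex)` as the stride and the matching `offsetof` value as the offset, with all four arrays enabled via glEnableClientState.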
You may also put index data into a buffer object, using the GL_ELEMENT_ARRAY_BUFFER_ARB target. This is pretty straightforward. The only constraint is that you use GL_UNSIGNED_BYTE or GL_UNSIGNED_SHORT as your index type; the hardware doesn't seem to support 32-bit indices.
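If your tools produce 32-bit indices, they need to be narrowed before being handed to PSPGL. A minimal sketch, using a hypothetical helper name, that converts to 16-bit indices and reports failure when a mesh is too large to index this way:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy count 32-bit indices into 16-bit storage.
 * Returns 1 on success, or 0 if any index does not fit in 16 bits
 * (in which case out may be partially written and should be discarded). */
int narrow_indices_to_u16(const uint32_t *in, size_t count, uint16_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (in[i] > UINT16_MAX)
            return 0;
        out[i] = (uint16_t)in[i];
    }
    return 1;
}
```

A mesh that fails this check would have to be split into several smaller index ranges, each drawn with its own call.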
PSPGL supports only 2D textures with power-of-two dimensions. 1D textures can be easily emulated with an Nx1 (or 1xN) texture; 3D textures are just not available.
PSPGL will convert textures you provide into the native hardware format, but it will only perform a limited range of conversions. In general, it will rearrange bits in a pixel format, but it won't convert the type or format of a texture. Therefore, if you want a GL_RGB internal format, you must provide a GL_RGB format texture.
glTexImage2D and glTexSubImage2D work mostly as expected, though glPixelStore has not been implemented, so all textures are assumed to be tightly packed in contiguous memory (no 4-byte rounding for each row either). PSPGL will make a copy of your texture data, so you can free/overwrite/reuse your copy immediately. PSPGL will also manage cache flushing, etc, so you don't need to.
Mipmapping mostly works as expected. The GL_GENERATE_MIPMAP texture parameter is supported, so you can get the hardware to generate mipmaps for you. PSPGL also supports a GL_GENERATE_MIPMAP_DEBUG_PSP flag, which will add tinting to each mipmap level to make them easier to see. Automatic mipmap generation only works for RGBA textures - not compressed, indexed or luminance/intensity formats.
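Since only power-of-two dimensions are accepted, an image loader may want to validate or round up sizes before uploading. A short sketch with hypothetical helper names:

```c
#include <assert.h>

/* Returns 1 if n is a power of two (1, 2, 4, ...), 0 otherwise. */
int is_power_of_two(unsigned int n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Returns the smallest power of two that is >= n.
 * n must be in 1..2^30 so the result fits in an unsigned int. */
unsigned int next_power_of_two(unsigned int n)
{
    unsigned int p = 1;

    assert(n > 0 && n <= (1u << 30));
    while (p < n)
        p <<= 1;
    return p;
}
```

A 100x60 image, for instance, would need to be padded or rescaled to 128x64 before it can be used as a texture.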
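Because rows must be tightly packed, image data whose rows are padded (for example to a 4-byte boundary) has to be repacked before upload. A minimal sketch, assuming you know the source row stride in bytes; `pack_rows_tight` is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy height rows of row_bytes bytes each from a padded source image
 * (rows src_stride bytes apart) into a tightly packed destination.
 * dst must hold row_bytes * height bytes and must not overlap src. */
void pack_rows_tight(const unsigned char *src, size_t src_stride,
                     size_t row_bytes, size_t height, unsigned char *dst)
{
    assert(src_stride >= row_bytes);
    for (size_t y = 0; y < height; y++)
        memcpy(dst + y * row_bytes, src + y * src_stride, row_bytes);
}
```

The packed buffer can be freed or reused as soon as glTexImage2D returns, since PSPGL keeps its own copy.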
The PSP hardware seems to have a bug where a texture viewed from a particular angle will use larger mipmaps than necessary, which means the appearance will change on screen as view angle changes, and there may be a performance impact.
The following types and formats are supported:
Format | Type | Hardware bytes/pixel |
GL_RGB | GL_UNSIGNED_BYTE | 4 |
GL_RGB | GL_UNSIGNED_SHORT_5_5_5_1 | 2 |
GL_RGB | GL_UNSIGNED_SHORT_5_6_5 | 2 |
GL_RGB | GL_UNSIGNED_SHORT_4_4_4_4 | 2 |
GL_RGB | GL_UNSIGNED_SHORT_1_5_5_5_REV | 2* |
GL_RGB | GL_UNSIGNED_SHORT_5_6_5_REV | 2* |
GL_RGB | GL_UNSIGNED_SHORT_4_4_4_4_REV | 2* |
GL_BGR | GL_UNSIGNED_SHORT_5_6_5 | 2* |
GL_RGBA | GL_UNSIGNED_BYTE | 4* |
GL_RGBA | GL_UNSIGNED_SHORT_5_5_5_1 | 2 |
GL_RGBA | GL_UNSIGNED_SHORT_4_4_4_4 | 2 |
GL_RGBA | GL_UNSIGNED_SHORT_1_5_5_5_REV | 2* |
GL_RGBA | GL_UNSIGNED_SHORT_4_4_4_4_REV | 2* |
GL_ABGR_EXT | GL_UNSIGNED_SHORT_4_4_4_4 | 2* |
GL_LUMINANCE_ALPHA | GL_UNSIGNED_BYTE | 4 |
GL_LUMINANCE | GL_UNSIGNED_BYTE | 1* |
GL_ALPHA | GL_UNSIGNED_BYTE | 1* |
GL_INTENSITY | GL_UNSIGNED_BYTE | 1* |
Formats marked with * are native and require no conversion.
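According to the table, GL_RGB data supplied as GL_UNSIGNED_BYTE occupies 4 hardware bytes per pixel, while the 16-bit types occupy 2. If the loss of precision is acceptable, you can repack 8-bit RGB data yourself. The sketch below uses the standard OpenGL bit layout for GL_UNSIGNED_SHORT_5_6_5 (red in the top 5 bits, blue in the bottom 5); the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack one 8-bit-per-channel colour into GL_UNSIGNED_SHORT_5_6_5 layout
 * by dropping the low bits of each channel. */
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert pixel_count tightly packed RGB888 pixels into 5_6_5 pixels. */
void convert_rgb888_to_565(const uint8_t *src, size_t pixel_count,
                           uint16_t *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixel_count; i++)
        dst[i] = pack_rgb565(src[3 * i], src[3 * i + 1], src[3 * i + 2]);
}
```

The converted data would be uploaded with format GL_RGB and type GL_UNSIGNED_SHORT_5_6_5.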
The PSP hardware supports paletted textures, and PSPGL exposes that with the EXT_paletted_texture extension. Each texture object has its own colour map, which is set with glColorTableEXT using the GL_TEXTURE_2D target. The colour table can be in any of the GL_RGB or GL_RGBA format/type combinations supported for textures. The paletted textures themselves can use a format of GL_COLOR_INDEX4_EXT, GL_COLOR_INDEX8_EXT or GL_COLOR_INDEX16_EXT. There's no way to share a colour table between textures.
glTexSubImage2D does not work properly on 4-bit/texel paletted textures.
The PSP hardware supports DXTn (aka S3TC) compressed textures. PSPGL implements the glCompressedTexImage2D call to allow compressed textures to be copied into GL. It supports these compressed formats:
- GL_COMPRESSED_RGB_S3TC_DXT1_EXT
- GL_COMPRESSED_RGBA_S3TC_DXT1_EXT
- GL_COMPRESSED_RGBA_S3TC_DXT3_EXT
- GL_COMPRESSED_RGBA_S3TC_DXT5_EXT
glCompressedTexSubImage2D has not been implemented.
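glCompressedTexImage2D takes an imageSize argument giving the size of the compressed data in bytes. For S3TC formats this follows from the 4x4 block structure defined by the EXT_texture_compression_s3tc extension: 8 bytes per block for DXT1 and 16 bytes per block for DXT3 and DXT5. A small sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one S3TC-compressed image level.
 * block_bytes is 8 for the DXT1 formats and 16 for DXT3/DXT5.
 * Dimensions are rounded up to whole 4x4 blocks. */
size_t s3tc_image_size(size_t width, size_t height, size_t block_bytes)
{
    assert(block_bytes == 8 || block_bytes == 16);
    return ((width + 3) / 4) * ((height + 3) / 4) * block_bytes;
}
```

For a 256x256 texture this gives 32768 bytes for DXT1 and 65536 bytes for DXT3 or DXT5.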
Texture matrix transforms don't work.
Texture coord generation (used for environment mapping, projectors, etc) is not implemented. Judging from the implementation of environment mapping in libgu, some of the lights are used as parameters for texcoord generation, which means that it will interact with lighting (you lose some lights while generating texture coords).
When you create a GL context with EGL, you can specify what configuration of framebuffers you want bound to that context. You can find an appropriate configuration with eglChooseConfig, which will return a number of configurations which match the attributes you specify. If there are no valid configurations, then it will return none.
You can use this mechanism to specify whether you need an alpha channel, how many bits per RGB channel, and whether you need a depth buffer or a stencil buffer. Note that in the PSP hardware, the stencil buffer and the destination alpha channel share the same space, so you can have one or the other for a given GL context.
GLUT is simpler to use, and so does not let you specify fine details of the GL configuration; it will always request 8 bits of RGB. You can use the GLUT_RGB, GLUT_ALPHA, GLUT_STENCIL and GLUT_DEPTH flags to glutInitDisplayMode to specify a buffer configuration.