
Thread: Lossy 3D Object transformation

  1. #1
    Member
    Join Date
    Feb 2011
    Location
    St. Albans, England
    Posts
    20
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Lossy 3D Object transformation

    As stated in various other posts where I ramble, my main focus is on 3D files.

    In standard 3D object files, there are a few simple things you can do to drastically reduce the file size. However, the 3D arena being what it is tends to shy away from anything that counts as a "drastic" change, so I've kept alterations to the files to a minimum, written a summary of how the transformation works, and intend to write plugins to support it.

    I would like to point out a few things:
    1 - I am not very good at writing these things up, so please criticise away.
    2 - The transformation is kept simple to understand and ASCII-compliant - which seems important to the art world.
    3 - This is a prelude to writing an actual archiving system (mainly for myself - over 300GB of files can easily be shaved down, and that is a SMALL collection).

    LossyOBJ.zip contains a text document explaining how it works - please criticise away my style of writing & anything that's obvious!

    marine3.zip is the OBJ file I worked from. It's actually quite a small one compared to a lot of the ones I use.

    Again, I'm trying to achieve a transformation that is still ASCII-compliant. This is only a tiny part of what I need before I consider writing an actual archiver (Poser has many of its own formats - mostly ZIP rip-offs - that need to be addressed).

    Poser itself generates a LOT more vertices than standard 3D files (on average, 4x more) because it is heavily detail-based - ditto for the JPEG/PNG files that it uses. Any saving is welcome.
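
    To give a flavour in code, here is a minimal Python sketch of the kind of ASCII-preserving trim described in the attachment - this is only an illustration, not the exact rules in LossyOBJ.zip, and the NumVerts header name is borrowed from the results quoted later in this thread:

    Code:
    # Hypothetical sketch of an ASCII-preserving OBJ trim, not the attached spec:
    # collapse runs of whitespace and fold the per-line "v " keyword of the vertex
    # block behind a single NumVerts header, keeping everything plain text.
    import re

    def trim_obj(lines):
        rest, verts = [], []
        for line in lines:
            line = re.sub(r'[ \t]+', ' ', line.strip())   # remove double spacing
            if line.startswith('v '):
                verts.append(line[2:])                    # drop the repeated "v " keyword
            else:
                rest.append(line)
        # one header announces the vertex count; the coordinates follow, still ASCII
        return ['NumVerts ' + str(len(verts))] + verts + rest

    with open('marine3.obj') as f:                        # the attached sample model
        print('\n'.join(trim_obj(f.readlines()))[:200])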
    Attached Files

  2. #2
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    184
    Thanks
    3
    Thanked 3 Times in 1 Post
    I don't understand the art-world's fascination with ASCII. I import many model formats into my 3D engine, and most are text-based, but I still don't see the appeal. You'd save so many bytes just by using a binary format; floating-point formats can be defined like any other. Some formats are binary, and they work just as well as the others. And yet the new flavour-of-the-year format is XML-based...

    For compression, an ASCII-to-binary conversion - ZIP, which you cite, is not ASCII either - could happen behind the scenes.

    Second, from a compression perspective, you can gain a lot by understanding how most tools produce 'auto-normals'. Then you only need to store those normals that are not re-creatable from the data; the vast majority of them, the vast majority of the time, can indeed be recreated from the vertices.
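
    Roughly, the recreation can be sketched like this (an assumed implementation, not anyone's production code): accumulate area-weighted face normals per vertex, normalise, and keep only the stored normals that disagree with the recreated ones.

    Code:
    # Sketch: rebuild 'auto' smooth normals from the vertices and faces, then flag
    # only those stored normals that differ from the recreated ones and so must be kept.
    def cross(a, b):
        return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

    def auto_normals(verts, faces):
        acc = [[0.0, 0.0, 0.0] for _ in verts]
        for i, j, k in faces:                                  # triangles as index triples
            e1 = [verts[j][n] - verts[i][n] for n in range(3)]
            e2 = [verts[k][n] - verts[i][n] for n in range(3)]
            fn = cross(e1, e2)                                 # area-weighted face normal
            for v in (i, j, k):
                for n in range(3):
                    acc[v][n] += fn[n]
        out = []
        for a in acc:
            l = (a[0]**2 + a[1]**2 + a[2]**2) ** 0.5 or 1.0    # avoid division by zero
            out.append((a[0]/l, a[1]/l, a[2]/l))
        return out

    def must_store(stored, recreated, tol=1e-3):
        # indices of the normals the file has to keep because they differ from the auto ones
        return [i for i, (s, r) in enumerate(zip(stored, recreated))
                if max(abs(s[n] - r[n]) for n in range(3)) > tol]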

    And you can also model texture coordinates, because it's really rare to have reuse of texels and tex-coords usually maintain the spatial layout of the mesh they wrap, etc. In fact a compressor might recompute the textures to reuse texels and, with an appropriate smudge factor, chop them up differently. But of course textures are usually power-of-two (POT) sized, so there'd not be much gain (unless you store a NPOT texture that you POTize on decompression...)

    In the kind of models I've been looking at, you can gain a lot by looking for symmetry and the re-use of sub-meshes. You typically only need to store the vertices for half a car, for example, and a note to mirror it on some plane to create the other half...
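
    A toy version of that mirror test, assuming the mirror plane is x=0 and quantising positions to a tolerance (a real tool would also search for the plane rather than assume it):

    Code:
    # Toy sketch of the mirroring idea: if every vertex has a counterpart with its
    # x coordinate negated (within a tolerance), only half the mesh plus a
    # 'mirror across x=0' note needs to be stored.
    def is_mirrored_on_x(verts, tol=1e-4):
        def key(v):
            return (round(v[0] / tol), round(v[1] / tol), round(v[2] / tol))
        table = {key(v) for v in verts}
        return all(key((-x, y, z)) in table for x, y, z in verts)

    print(is_mirrored_on_x([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]))  # True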

    In my world, most of the incoming artwork is too high quality, and it's level-of-detail filtering that leads to big, lossy but acceptable gains in file size (which directly equates to graphics RAM at runtime).

    I put some thought into compressing models so as to gain RAM I could use for other things - like mipmaps etc - but in the end, I realised that the reason real games use ASCII formats and such is that it just doesn't really matter. It didn't matter 10 years ago with MD2 and MD5 and so on, and it matters even less now...

    Here is a random picture, without explanation, to show the kind of visualisation I get from my analysis of game files:

    Last edited by willvarfar; 12th February 2011 at 18:38.

  3. #3
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    ASCII formats are usually preferred by game designers simply because they make exporters easier to write (text is easy to read when hunting exporter bugs). Some companies switch to binary variations as well (e.g. id Software did it for the map formats in Doom 3). Most game formats include redundant data which could be computed at load time, but companies prefer (especially in map formats) to store the data in the file - I suppose they want to reduce loading times. Still, some properties should be stored even if they "seem" redundant. The most obvious one is vertex normals: from a programmer's point of view they can be calculated at loading, but designers usually tweak the "smoothed" normals to create sharp edges, so just forget recomputing them. At the loading phase, I prefer to calculate only the tangent coordinates which are used by per-pixel lighting. The others are not very redundant, IMO. If you really want to do micro-optimisations, you can stick to procedural geometry and textures. For example, you can use several predefined primitives (sphere, cube, raw mesh etc.) and a set of deformers (bender, twister, CSG operator etc.). In the end you will be really amazed by the file size (in some cases you can store 1 million triangles in a few bytes). Most demosceners use that approach to squeeze their executables into the 4K or 64K categories. You can extend the idea by using tracker formats for audio.
    BIT Archiver homepage: www.osmanturan.com

  4. #4
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    184
    Thanks
    3
    Thanked 3 Times in 1 Post
    Specifically normals. I said that if you can recalculate the normals you don't need to store them. In my experience, at least for those models I've looked at, this is true.

    But imagine the world of smoothed normals and such. The likely values of a normal are fairly constrained, even if smoothed, and their precision is often overspecified. This predictability is perfect input to a compressor.

    The vertices could be modelled in much the same way. Are acute angles between faces common? Likely not. Is the fact that a corner in the already-compressed input was acute in a certain direction a good indicator that the following edge is also acute? Yes, it likely is, given that edges tend to continue. And so on.

    Additionally the order of the vertices can often be sorted in some spatial way without affecting the correctness of the model, which can then allow greater predictability and therefore compression.
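
    One concrete ordering that does this is a Morton (Z-order) sort - the sketch below quantises each axis to 10 bits (an arbitrary choice for illustration) and remaps the face indices afterwards:

    Code:
    # Sketch: sort vertices along a Morton (Z-order) curve so spatially close points
    # become neighbours in the file, then remap face indices to the new order.
    def morton(v, lo, size, bits=10):
        def spread(x):                          # interleave the bits of one axis
            out = 0
            for b in range(bits):
                out |= ((x >> b) & 1) << (3 * b)
            return out
        q = [int((v[a] - lo[a]) / size[a] * ((1 << bits) - 1)) for a in range(3)]
        return spread(q[0]) | (spread(q[1]) << 1) | (spread(q[2]) << 2)

    def sort_vertices(verts, faces):
        lo = [min(v[a] for v in verts) for a in range(3)]
        hi = [max(v[a] for v in verts) for a in range(3)]
        size = [max(hi[a] - lo[a], 1e-9) for a in range(3)]
        order = sorted(range(len(verts)), key=lambda i: morton(verts[i], lo, size))
        remap = {old: new for new, old in enumerate(order)}
        return [verts[i] for i in order], [tuple(remap[i] for i in f) for f in faces]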

    Regarding procedural compression - yes, things like 3DS, where you tend to start with solids and then deform them in a modifier stack, are better compressed by storing the solid and the modifiers rather than the resultant mesh. But it's not so trivial to start with a mesh and determine an efficient set of modifiers that return it to a simple solid.

    Lots of models are made with a lot of copy and pasting and mirroring and such, and you can definitely find a lot of that trivially and re-define parts of meshes in terms of mirroring and simple transformations of other parts of other meshes.

    I did write some code that went looking for this stuff, and found a nice amount of such trivial redundancy easily in the game data I was studying.

    But of course back to the original question, a lot depends on the ratio of geometry and textures you are trying to compress. A lot might be done by massaging your textures to reduce their size rather than considering the geometry info.

    In fact, thinking about it, a simple de-dup (well, 'simple' is suggesting it's not an art) will likely massively reduce any artwork library of models and textures?

    I wrote a tool that de-dupes the game data for Glest: https://github.com/williame/GlestToo...st_mod_pack.py

  5. #5
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Well, again, most games already do something like you mention (not fully, of course). For example, MD2 (the Quake 2 model format), MD3 (Quake 3) and MDS (Return to Castle Wolfenstein) all use quantized normals. IIRC, a 2-byte encoding is used to represent 3 floats (= a single normal). Because of the quantization, the data becomes more compressible. Another thing: in all quake-based map formats, the map compilers always reduce vertex precision (IIRC, to 1/...). There are two advantages to that: 1 - you eliminate float precision errors, which sometimes cause holes between triangles; 2 - you increase compressibility.
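
    Something in the spirit of that 2-byte encoding, as a hedged Python sketch - this is a generic latitude/longitude packing for illustration, not the exact scheme any of those formats use:

    Code:
    # Sketch of 2-byte normal quantisation: one byte of latitude, one of longitude.
    # The decoded normal is approximate, but the data becomes far more compressible.
    import math

    def pack_normal(x, y, z):
        lat = int(math.acos(max(-1.0, min(1.0, z))) / math.pi * 255 + 0.5)
        lng = int((math.atan2(y, x) % (2 * math.pi)) / (2 * math.pi) * 256) & 0xFF
        return (lat << 8) | lng                        # two bytes per unit normal

    def unpack_normal(packed):
        lat = ((packed >> 8) & 0xFF) / 255 * math.pi
        lng = (packed & 0xFF) / 256 * 2 * math.pi
        return (math.cos(lng) * math.sin(lat),
                math.sin(lng) * math.sin(lat),
                math.cos(lat))

    print(unpack_normal(pack_normal(0.0, 0.0, 1.0)))   # roughly (0, 0, 1)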

    Another thing, which is more or less lossless: you can represent a rotation matrix (3x3 = 9 floats) with a quaternion (4 floats, or only 3 stored if you rebuild the scalar part from unit length). You could use an angle decomposition instead, but the quaternion is more precise.
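
    For the matrix <-> quaternion part, a bare-bones Python sketch (standard conversion, trace-positive branch only for brevity; a full implementation picks the branch with the largest diagonal term):

    Code:
    # Sketch: collapse a 3x3 rotation to a unit quaternion; storing only (x, y, z)
    # and re-deriving w = sqrt(1 - x^2 - y^2 - z^2) gives the 9-to-3 float saving.
    import math

    def matrix_to_quat(m):                     # m is a row-major 3x3 rotation matrix
        t = m[0][0] + m[1][1] + m[2][2]
        s = math.sqrt(t + 1.0) * 2.0           # assumes trace > -1 (the common case)
        w = 0.25 * s
        x = (m[2][1] - m[1][2]) / s
        y = (m[0][2] - m[2][0]) / s
        z = (m[1][0] - m[0][1]) / s
        return x, y, z, w                      # w > 0 in this branch, so it is recoverable

    def quat_to_matrix(x, y, z, w):
        return [[1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
                [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
                [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)]]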

    As to procedural geometry, I didn't mean approximating a given arbitrary mesh. Instead, I meant something like this: http://www.theprodukkt.com/, especially http://www.theprodukkt.com/kkrieger which is just a 96k game. It's very much like a CAD format. For example, create a cylinder with a given radius, height, rotation segments and height segments; then apply a bender to make an exotic building pillar. You just need to store the radius, height, rotation segments, height segments and bender parameters. As you can guess, that's in the <100 byte range while the generated mesh can be several megabytes. You can also use this for textures and audio. For audio it's already common (i.e. trackers).
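
    A trivial illustration of that idea - not kkrieger's code, just a plain cylinder generator whose whole "file" is four numbers; the deformers would be applied on top of the generated vertices:

    Code:
    # Sketch: regenerate a cylinder mesh from a handful of stored parameters
    # instead of storing its full vertex list.
    import math

    def cylinder(radius, height, radial_segments, height_segments):
        verts, faces = [], []
        for h in range(height_segments + 1):              # rings of vertices
            z = height * h / height_segments
            for r in range(radial_segments):
                a = 2 * math.pi * r / radial_segments
                verts.append((radius * math.cos(a), radius * math.sin(a), z))
        for h in range(height_segments):                  # two triangles per quad
            for r in range(radial_segments):
                a = h * radial_segments + r
                b = h * radial_segments + (r + 1) % radial_segments
                faces.append((a, b, b + radial_segments))
                faces.append((a, b + radial_segments, a + radial_segments))
        return verts, faces

    v, f = cylinder(1.0, 4.0, 64, 32)   # four stored numbers -> thousands of triangles
    print(len(v), len(f))               # 2112 vertices, 4096 triangles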

    If you want to play around with Kolmogorov complexity, today's hardware allows it. With pixel shaders you can already generate textures on the fly, procedurally, with C-like scripts. As an extra, you can now create arbitrary geometry with geometry shaders on decent hardware.

    In short, there are lots of possibilities in the game world. You just need to pick the one you favour.
    BIT Archiver homepage: www.osmanturan.com

  6. #6
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    184
    Thanks
    3
    Thanked 3 Times in 1 Post
    EwenG, you are keen to describe your transforms as 'lossy' in the sense that they produce non-byte-identical but information-identical files. I think we might term that 'optimisation'? You are making a 3D model optimiser? At the moment, this optimisation for storage purposes is the tidying up of whitespace?

    You are careful to say your transformation has little computational overhead and such, and much of the journey that osmanturan and I went off on is perhaps not that.

    I played with compressing 3D models just before Christmas, and those game models were not unlike the example you showed. In that example, you can see that the model is essentially mirrored on the vertical axis. It is not impossible to detect trivial instances of this.

    I should guess that an OBJ -> binary -> OBJ pipeline - with modelling of values, mirroring detection, tweens in animation and the like - would not only optimise the file but also reduce its footprint dramatically, by orders of magnitude. That is my guess.

    So, if you move into storing a binary version of the file, with a predictive model of how meshes are made and other optimisations like discarding unneeded precision in fp representation and such, I think that what you pay in project complexity you would gain in disk space!

    You have talked in another thread about the textures you deal with: http://encode.ru/threads/1217-how-th...ll=1#post24034 - I think the answer to your question is that a compressor that understood the relationship between the models would make a dramatic optimisation.

    Generally, it is my assertion that normal 3D models display a dramatic conformity to a natural, spatial world - a physical sympathy. In principle, a mesh could contain random points linked in random ways, but that would look like a very spiny sea urchin. Models are very, very predictable; the way the textures are tessellated is very predictable, the normals are very predictable, the faces and so on. It's just that conventional compressors don't understand enough about what they are modelling - that it is 3D geometry, even - to discover this redundancy. If you built a compressor that understood what it was compressing - 3D geometry or the corresponding textures - you could have real success predicting where in space the next vertex will be, what the corresponding texture coordinate is, what the normal is and so on. And in many models that represent machined things you can usually find lots of sub-mesh duplication that can be expressed by indices and a matrix.
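
    To make the "predicting where the next vertex will be" point concrete, here is the classic parallelogram predictor as a tiny sketch - one of many possible predictors, not a finished codec:

    Code:
    # Sketch: the parallelogram rule guesses the fourth vertex of an adjacent
    # triangle from the three already decoded, so the coder only stores small,
    # highly compressible residuals instead of raw coordinates.
    def parallelogram_predict(a, b, c):
        # predicted = b + c - a, i.e. complete the parallelogram opposite vertex a
        return tuple(b[i] + c[i] - a[i] for i in range(3))

    def residual(actual, a, b, c):
        pred = parallelogram_predict(a, b, c)
        return tuple(actual[i] - pred[i] for i in range(3))

    print(residual((1.05, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
    # -> roughly (0.05, 0.0, 0.0): the prediction was almost exact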

    In the game files I've looked at, 60% of the space was textures and 40% models. In that kind of world, you could make orders of magnitude improvements over simple zip+xz compression (which is the fun combo that I decided upon for packing game data, after deciding that clever optimisation of models, whatever benefit it gives, wasn't worth spending time on in a world with broadband for me. I have a hobby game to write!).

    Osmanturan raised good points on the tangent, and if it's OK I'd like him to say more. I've got procedurally generated trees and planets in my game using my own flavour of souped-up randomised eisenscript, and I'm thinking now about procedurally generating villages and other scenery (although I don't yet do much in the shaders; sadly I tend to spend my shading budget on the classic stuff already). So all the links are very nice to follow
    Last edited by willvarfar; 13th February 2011 at 19:15.

  7. #7
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    At first look, procedural city (in your case, village) generation seems a bit odd, but it has already been studied by lots of researchers. There are even procedural city generators for 3D editors (e.g. CityGen). I've used CityGen for Cinema 4D in the past. There are also some papers about how to generate cities procedurally. Here is one of them: http://www.citygen.net/files/citygen_gdtw07.pdf. And here is an article which is much simpler: http://www.shamusyoung.com/twentysidedtale/?p=2954. For terrain, you can use Perlin-noise-based generation. As a result, you can build a large landscape at almost no cost.

    As to texture generation, I recommend thinking about multi-layered generation. For example, suppose you have only a checkerboard and Perlin noise. If you define a weight and a blending filter for each layer, you can easily generate a floor texture with only two functions. BTW, another common idea: you can use that information to generate normal maps and height maps for parallax per-pixel lighting.

    A very tiny but useful detail is using a deterministic random process - I mean a random function which behaves deterministically on every run. That way you can recreate your world exactly, in a lossless sense. As a small optimisation, you can use the Mersenne Twister for speed.
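
    A small sketch of what I mean (Python's random module is itself a Mersenne Twister; the seed-per-name scheme is just one way to do it):

    Code:
    # Sketch: derive every generator from a fixed world seed plus a stable name,
    # so re-running the program recreates exactly the same content.
    import random

    WORLD_SEED = 1234                          # the only value that needs storing

    def rng_for(name):
        # string seeds hash deterministically, so the same name gives the same
        # stream on every run
        return random.Random("%d:%s" % (WORLD_SEED, name))

    terrain = rng_for("terrain")
    heights = [terrain.uniform(0.0, 10.0) for _ in range(5)]
    replay = rng_for("terrain")
    print(heights == [replay.uniform(0.0, 10.0) for _ in range(5)])   # True: reproducible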
    BIT Archiver homepage: www.osmanturan.com

  8. #8
    Member
    Join Date
    Feb 2011
    Location
    St. Albans, England
    Posts
    20
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks for all the great input.

    I used the term "lossy" mainly for the purists who will moan that their CR LFs won't be "in the same order", that extra spacing has been removed, and that comments are gone (I really don't see why they exist). It's a trivial transformation & optimisation.

    The format I use is Poser's. It is heavily detail-intensive, so there is always a mass of redundancy in the files (a Blender file, for example, would contain only 20% of the amount in the Poser version). Converting it to binary would be a serious boon - however, introducing something so "drastic" would make most of the art world cry. I currently have a collection of these files eating up around 300GB, and converting & optimising their various ASCII formats & JPEG/PNG files would reduce that immensely.

    Willvarfar is correct on the point of "mirroring". To my knowledge, most models do indeed mirror themselves to some degree. Sometimes there are extra details here & there, but the rule holds across the models I have.

    I picked on the OBJ format here because of how widely used it is. However, that is just a tiny part of a bigger routine I will be working on. If you want to see what I'm referring to, grab the Victoria 4.2 base model from DAZ 3D. Once it's installed, the files are 26.5MB in size. Then you need to run their "initialise" routine, which unpacks the data to a massive 26.7MB (yes - they compress it to save 0.2MB! The decompression routine is around 3MB). DAZ 3D's files use the BitRock installer, which applies LZMA2 compression (for text & images), whereas other companies release models as a plain ZIP of the files. So yes, a hell of a lot of redundancy in these files. BitRock, incidentally, leaves an over-bloated uninstall file on your HDD. It also seems to cause an error when uninstalling a few files at a time - but that's another story.

    There are already many great routines that address the images by optimising the DEFLATE stream. With respect to the OBJ files, keeping all the essential data intact is paramount for the format I use. However, there are occasions where you can use the textures as "hints" for the OBJ file itself. Furthermore, you could go one step beyond that and use the OBJ file to "trim" the texture files too.

    Another annoyance of the Poser format is its multitude of files, such as the PZ2 format (ASCII), which looks something like this:

    Code:
    {
    version
    	{
    	number 6
    	}
    	{
    	// This file calls all _ERC poses in all sub-directories.
    	}
    
    	readScript ":Runtime:libraries:!DAZ:Victoria 4:hip:01-ps_pe069_DAZ_ERC.pz2"
    	readScript ":Runtime:libraries:!DAZ:Victoria 4:abdomen:02-ps_pe069_DAZ_ERC.pz2"
    	readScript ":Runtime:libraries:!DAZ:Victoria 4:chest:03-ps_pe069_DAZ_ERC.pz2"
    	readScript ":Runtime:libraries:!DAZ:Victoria 4:neck:04-ps_pe069_DAZ_ERC.pz2"
    	readScript ":Runtime:libraries:!DAZ:Victoria 4:head:05-ps_pe069_DAZ_ERC.pz2"
    I'm sure you can see how even a simple transformation would benefit this kind of file. Yes, they are folder & file pointers. It makes lots and lots of folders *sigh*
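
    As a quick illustration of the kind of simple transformation I mean (just a sketch - a real routine would need to be reversible for the whole file, not only the readScript lines), you can factor the shared path prefix out and keep only the varying tails, still in ASCII:

    Code:
    # Sketch: pull the common prefix out of the readScript paths so only the
    # varying tails remain; trivially reversible.
    import os

    def factor_readscripts(lines):
        paths = [l.split('"')[1] for l in lines if l.strip().startswith('readScript')]
        prefix = os.path.commonprefix(paths)
        tails = [p[len(prefix):] for p in paths]
        return prefix, tails

    lines = ['readScript ":Runtime:libraries:!DAZ:Victoria 4:hip:01-ps_pe069_DAZ_ERC.pz2"',
             'readScript ":Runtime:libraries:!DAZ:Victoria 4:abdomen:02-ps_pe069_DAZ_ERC.pz2"']
    prefix, tails = factor_readscripts(lines)
    print(prefix)   # :Runtime:libraries:!DAZ:Victoria 4:
    print(tails)    # ['hip:01-ps_pe069_DAZ_ERC.pz2', 'abdomen:02-ps_pe069_DAZ_ERC.pz2']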

    My aim is two-fold.
    1. To create a way to archive these files using optimisation, transformation & compression. I have around 15,000 of these files archived, the bulk in "zip" format (or "rar" where I've quickly repacked them for archiving), which makes installing them an annoyance. To make things simple for myself, I'm aiming to create my own "installer" for these, so no plug-ins are needed for this type of archiving. This is something for another thread.

    2. Which is what I'm doing here: create an open-source format that is "seemingly instant" in transformation, that artists can still use to "look at the big numbers in notepad", and that is easy to port to any 3D routine or platform. That way the purists won't argue too much. We all know that converting these files to binary would be a massive step - and maybe it is the way forward. Surely the days of ASCII in art are numbered.

    In my own opinion, the OBJ format is way too redundant - however, applying decent transformation & compression and stating "this will save you a lot of time & space" would be like handing a Ferrari to a caveman. I guess a lot of people just like being able to look at the big numbers in notepad.

  9. #9
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    184
    Thanks
    3
    Thanked 3 Times in 1 Post
    EwenG, I can't really understand what you plan to offer to entice people to adopt your format.

    From your original text file you attached:

    With this example, after transforming the NumVerts block (ONLY) by removing double spacing, and CR+LF+"v", the results became;
    Original: 707,908
    Transformed: 591,675

    Testing this method in archiving - I used the 2 normal methods they are archived
    Zip results
    Original: 188,525
    Transformed: 176,972

    RAR results
    Original: 166,380
    Transformed: 162,718
    Clearly that wasn't the marine.obj file that you attached.

    Anyway, the marine.obj zips to about 1.2MB, but with 7z it's just 800KB. 7z has special handling for tabular data and it really helps with this kind of input. (ZPAQ gets under 700KB.)

    It's worth noting that normal LZ compressors can compress floats expressed as strings just as easily as floats expressed in binary - simply going binary will halve (or better) the file size but will not impact the compressed-with-ZIP/7Z/RAR size of an archive.

    ZIP/RAR output is, simplistically, just a binary version of the file. If you make a compressor that understands geometry - so it can make the modelling-specific optimisations we described, and many others - and then push the result through a CM, you will have a compressor that is many times better at compressing 3D models than the general-purpose compressors, which are LZ and low-order based. And you can decompress to an optimised OBJ or 3DS or BLEND or MD5 or whatever output. This would have to support the superset of formats - basically be collada-compatible - but there are some tricks for safely handling formats with blocks you don't understand: compress what you can extract and compress the rest as a delta.

    You could go further - you could make it not only a model compressor but a model converter. And you could gain utility by good shell integration so it works seamlessly - transparently even - with a variety of 3D toolchains.

    And regarding the images, there are masses you can save over JPEG compression, and it may be as simple as starting with H.264 intra transcoding. And when that gives startling results, you might wonder why I suggest it for your particular images, and how it might be specialised further to your kind of input to give substantially better compression still. But you won't get a byte-identical JPEG back afterwards; you'll just get an equivalent JPEG when you decompress.

    If artists think they gain anything from ASCII formats, they have less mechanical sympathy than they fancy. Best not to tell them, and just say it's like ZIP, only for models, and it's much better - for models.
    Last edited by willvarfar; 15th February 2011 at 13:08.

  10. #10
    Member
    Join Date
    Feb 2011
    Location
    St. Albans, England
    Posts
    20
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Firstly, I apologise - clarity is not my strong point.

    The results were on the verts alone (the first "block", as it were), to validate the reason for the transformation.

    You are also correct to ask what would make it appeal to the art community in general. The idea is a transformation that "keeps the result in ASCII". I know it seems "stupid" doing it this way, but it's the first step towards making a decent routine. I find most of these artists like to keep things in ASCII "for portability". The code is available to anyone who wants it, and it's simple to understand what it does & integrate it into any product. Although my explanations can be rather awkward to read (autism can be a pain like that).

    Consider this a "smaller step" for a much more complicated routine later on.

    The main idea is to present it to you guys, who obviously know a GREAT deal more than me, and look for feedback on the best way to slowly drag the 3D art world out of the dark ages into at least the middle ages (hence keeping it all ASCII - which is less of a shock to them).

    As for compression: I fear that these people will stick to using Zip & RAR formats. I had a lot of trouble asking people to switch over to WinUHA or PeaZip because people tend to be "stuck in their ways", hence why I'm trying to find a "happy middle ground" that works to everyone's advantage. Although I'm starting to think it'd be easier to split the bit.

  11. #11
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    184
    Thanks
    3
    Thanked 3 Times in 1 Post
    I was hoping you'd go in the understanding geometry and texturing direction to lead to much better compression.

    I don't buy that the same data in a marginally abbreviated format is going to make a rewarding gain.

    There's lots of proper research into simplifying meshes (jargon: level of detail), into compressing them (as in predicting values), and some very neat research into approximate symmetry.

    If there is work on texture compression using awareness of the mesh it is wrapping in order to better model it, I am unaware of it. But it likely exists.

    Yet nobody has really put it together into a generally available utility.

    I implore you to be the person that does that!

  12. #12
    Member
    Join Date
    Feb 2011
    Location
    uk
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I was just thinking about how this is only possible when material intersection is allowed, and I realised that this is possible in 4 dimensions (causing a 3D object to intersect itself), so then isn't the premise for which you allow the sphere to intersect itself (higher dimensional transformation) the same as allowing a circle to evert by moving it 3-dimensionally?

  13. #13
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,462
    Thanks
    8
    Thanked 37 Times in 27 Posts
    We're still far, but on the way.
    This is the smartest one that I've seen.

  14. #14
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    184
    Thanks
    3
    Thanked 3 Times in 1 Post
    http://www.cs.sunysb.edu/~stripe/ Stripe can halve the size of Wavefront OBJ files, both before and after compression. Your marine3.obj becomes 533K with Stripe+xz, 445K with Stripe+zpaq.

    Stripe is not doing any symmetry tricks..

    Also, it's not doing any relative coordinates for the vertices (e.g. relative to the last vertex on first use).

    Combined with symmetry searching and relative coordinates, you can only imagine...
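
    For reference, relative coordinates are as simple as this kind of delta pass (sketch only; a real coder would quantise the deltas too):

    Code:
    # Sketch: store each vertex as a delta from the previous one, so the values
    # cluster around zero and compress far better than absolute coordinates.
    # Decoding is a running sum.
    def delta_encode(verts):
        prev, out = (0.0, 0.0, 0.0), []
        for v in verts:
            out.append(tuple(v[i] - prev[i] for i in range(3)))
            prev = v
        return out

    def delta_decode(deltas):
        prev, out = (0.0, 0.0, 0.0), []
        for d in deltas:
            prev = tuple(prev[i] + d[i] for i in range(3))
            out.append(prev)
        return out

    vs = [(1.0, 2.0, 3.0), (1.1, 2.0, 3.0), (1.1, 2.1, 3.0)]
    print(delta_decode(delta_encode(vs)) == vs)   # True (exact for these values)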
    Last edited by willvarfar; 24th February 2011 at 15:31.

