New "Game" - Terrarium: Flora

ScarPixel · August 31, 2020, 2:35am

Introducing Terrarium: Flora

It has been two months since I last updated my WIP called Terrarium. At the time I was at storage capacity and really struggling to find space to add new features. The easy thing would have been to just call it done and move on, but it felt so…unsatisfying.

As I stared at the code for the gazillionth time I realized I could optimize the water system by simplifying some underlying assumptions. Unfortunately this would have been difficult to retrofit into the existing code. So I started over

I basically rebuilt the project from the ground up with storage space in mind - something I didn’t do with the original. Note I include blinklib in the project because I modified it to save a bit of code space.

https://github.com/scartier/terrarium2

Some notable changes for those familiar with the old version:

All code is contained within one sketch. No more asymmetric tiles.
Drippers and dirt no longer take up a full tile.
Sun tiles and sunlight in general has been removed. It just wasn’t fun to manage given the code space it occupied. (Ask me about the rainbows I had implemented at one point.)

In addition, since plants are the spotlight of Terrarium: Flora:

Two new plant types: vines and seaweed. These are in addition to the standard tree-like plant.
Vines grow when the dirt is rotated downward before growing has started. Vines will always grow down.
Conversely, seaweed grows up, but only from a tile that is submerged in water. It will always stay under the water level.
You can tell the difference between the plant types by their leaf color.
Trees will now only begin growing when the dirt tile is facing upwards.

How to play

This project really shines when you have a lot of tiles. Use as many as you can!

Once your tiles are programmed you’re going to want two things: water and dirt.

Dripper

Double-click a tile to turn it into a dripper.

Drippers have one face colored cyan. Water will drip down from this like an icicle. You can control the rate at which it drips by clicking the tile. There are three dripping rates.

Note that the dripper tile also determines the direction of gravity. It will tell all the other tiles which way water should fall. If you have two drippers in the same cluster that are facing different directions things will probably not behave as you expect!

Water in all tiles evaporates at a slow rate. This prevents a single dripper from filling up the world. Unless you want that to happen! If so, turn the dripper on full blast (and maybe create a second or third).

Dirt

Double-click a dripper tile to switch it to dirt.

The bottom three faces are filled with dirt. Water that falls onto these faces will be absorbed, up to a point. Eventually they become saturated and start leaking.

Normal dirt tiles won’t do anything. But, if you click them you toggle them into a “fertile” state, as represented by a slight green tinge to the center face.

Fertile dirt tiles are ready to start growing a plant. All it needs is a water source.

Growing your first plant

Put all this together and you’ll get your first plant!

Plants sprout out of the faces next to the dirt. As your plant grows it will expand into neighboring tiles. Give it the room to grow and it will get rather large!

Tips

Double-clicking a dirt tile will change it back to a normal blank tile. Be careful, though, as this will quickly kill any plant that was growing.
Clicking a tile quickly five times (five-clicking?) will erase everything in it and reset its state. Useful for controlling plant growth, eliminating excess water, or just starting over.
You can graft plants by moving a grown branch from one part of a plant to another (or to an entirely different plant).
Unhappy plants will wither and die. This may happen if they lose their water source or get too much water and become submerged. Vines and seaweed will die if rotated opposite how they should grow.
Sprout a flower on your tree and you might be rewarded with a flying critter for your world.

Enjoy!

Dudepants · August 31, 2020, 2:12pm

Hey, I’m just curious. (and this should really be in it’s own stickied thread, but…) What were some of the biggest pitfalls in memory consumption you found? I think having a best practices for lower-memory alternatives would help for people making their first games, like me.

So far, I found some of the biggest sinks of memory are standard Math library functions. So, things like floor, ceil, min, max…even %(mod) are pretty taxing on the system. In some cases, I shaved like 5% of my memory usage off by changing a situation where I was using a %(mod) to having a variable counter that I just reset when it hit the multiple I wanted. It’s WAY less elegant, but it apparently is easier on the compiled memory size.
Another thing I noticed is if you can limit the calls to setColor (and it’s variations), I would methodize this out and abstract it where you can. Calling this a bunch of times will chew up a ton of memory, too.

Anyway, the biggest sink for me was definitely math calls so far. Just curious what you found.

Freddicus · August 31, 2020, 2:39pm

Your code - wow. I need to take notes! Is your primary C/C++? My primary is Java. I haven’t touched C/C++ since college ('01 - '06), and it shows in my yucky code. I’m just kind of stumbling through.

Between your game and mine, we can add a third game for Spring 2021 games. Get three growing / nature themed games out there for the season!

Anyhoo - congrats! I can’t wait to try it out once I get my KS2 Blinks / Dev Kit!

Dudepants · August 31, 2020, 5:15pm

Yeah, I’ve primarily only used Java and Kotlin for the past few years. It really is a detrimental language to get stuck on using. I was lucky to learn C++ as my core language in college and used it professionally for a few years, but a lot of schools teach Java for computer science. I don’t see how they can do that and expect anyone to understand pointers or memory management. If you never worried about how much memory you were using or the size of your data, chances are, you never bit shifted or masked, either. Guessing that’s a bit of learning curve for anyone that’s only used Java. (good thing you’ve used C/C++ before)

ScarPixel · August 31, 2020, 6:03pm

Sure, I’ll type up some more dev-centric notes about this release, including some strategies I used for storage savings. I put some reminder notes for myself at the top of the source code, if you want to get a peek at some of it.

I’ve stayed in the C-variants for most of my career: C, then Obj-C, then C#, then C++, back to C#, and now C again for Blinks (full circle ha!). I’ve also done some low level assembly a few times. Tempted to try some inline assembly for this project to see if it helps and also to refresh my skills a bit.

I’m down for a set of Spring-themed games! I have one more minor update to this one planned (fourth plant type). Then a follow-on/sequel that you can probably guess based on the title of this one. I also have my next project after Terrarium swimming in my head and it’s killing me not being able to start on it. Soooooon

ScarPixel · September 2, 2020, 4:13pm

Here’s some strategies I used to reduce storage usage. I’ll try to generalize my learnings, but will include examples as I remember them while coding Terrarium. Sorry this is a bit rambling!

Avoid calls to expensive functions

The compiler is pretty smart. If you don’t call a function anywhere then it won’t be included in the compiled output. So think hard when you call a potentially expensive function, especially when an alternative exists.

In my case, I was using makeColorHSB as I rendered face colors. HSB is a nice color space to work in, but makeColorHSB is an expensive function both computationally and code size. I eventually switched back to makeColorRGB, which is much more storage efficient.

Reduce corner cases

If you can design/structure some system to avoid a corner case, you will have saved an if/else statement, which translates into storage savings. Even if you can’t avoid the corner case entirely, as long as it’s a “don’t care” then you can avoid writing special case code to handle it.

In the original Terrarium, each tile relied on knowing the amount of water contained in each neighbor’s face. As water flows, it is passed from one tile to the next through communication messages. The source tile needs to know how much water the destination tile can accept so that it doesn’t send more than it can handle. Every time a face gained or lost water, it would transmit its current quantity to its neighbor. The neighbor would remember this and only send water if it could fit.

In the new code, each face instead tracks its own sense of being “full” or “not full”. It’s this full flag that gets transmitted to the neighbor, not the actual quantity.

Also, the max water a face can hold is not a strict limit. It is able to go above this amount if it is given extra water due to communications delay.

These modifications allowed me to remove corner case and range checking, reducing the code footprint in the new version.

This also brings up a related topic…

All comm packets are expendable

This is specific to my project, but is interesting. In the original code, packets sent between tiles were guaranteed to be received. Each face has a fix sized outgoing queue. If circumstances caused more packets to be queued than it could handle then an error code would be put on the face and the only way to clear it was to disconnect the two tiles and reconnect. As such, I was careful not to send too many packets at once.

The new code treats all packets as expendable. If a command is sent to the queue, but the queue is full, the command is simply dropped. The simulation is structured in such a way that the sender doesn’t care if it never gets there. It might result in odd behavior, but it won’t break things.

For instance, when passing water between tiles, if water is sent from one tile, but never received then so what? Those units of water would cease to simply exist, but it won’t break the simulation.

Another example. If a bug is flying within one tile and tries to transfer to a neighbor, a request packet is sent to that neighbor. If it’s okay then that neighbor sends an acknowledgement back and turns on its bug code. The original tile receives the ack, knows the bug was taken, and turns off its bug code.

Either one of those packets could fail to arrive. For the original transfer attempt, if it gets lost then the originating tile just keeps the bug as if the transfer was rejected. The more interesting case is if the acknowledgement packet never arrives. Then the source tile would never turn off its bug code, resulting in bug duplication ahhhh. I’ve never seen this in practice, but if it ever happens then no worries. It doesn’t break anything.

This eliminated all the error checking code around the transmit queues, as well as the rendering of the error conditions.

Trading code for data

If you have an abundance of data, but are running out of code space, then perhaps there is a way you could trade one for the other.

I first did this during an optimization pass on the water system in the original version. Before, the algorithm to flow water between faces and tiles was done purely in code. I knew there was a lot of repetitive code, though. Water flowing from face 0 to face 3 wasn’t that different than from face 3 to the neighbor tile below, or even sideways from face 0 to face 1. I created a data format that let me parameterize these differences, then made a simple loop that iterated through the table to do its magic. The table is waterFlowSequence[] if you’re curious.

I applied this same optimization when I iterated on the plant code in the current version. I created a table that defines a plant’s state at any given point during its growth. While this did save code space, I found myself quickly running out of data storage! Each entry in the table took four bytes, which consumed far too much data as I tried to add new plant types. I iterated a ton on the data format as I shaved bits out of it and eventually got it down to two bytes per entry. It’s still pretty tight, but I have room for one more plant type!

Butcher blinklib

Getting desperate? Dive into the blinklib source and see what you can excise. For instance:

The dim function is expensive. Even if your own code doesn’t use it, the blinklib itself does. Rather than remove it entirely, I changed it to be essentially a nop. This saves > 100 bytes. It does sacrifice the nice fade-out/fade-in as tiles go to sleep and wake up, but these are desperate times!

 Color dim( Color color, byte brightness) {
#if 1
    return color;
#else
    return MAKECOLOR_5BIT_RGB(
    (GET_5BIT_R(color)*brightness)/255,
    (GET_5BIT_G(color)*brightness)/255,
    (GET_5BIT_B(color)*brightness)/255
    );
#endif
}

The setColorOnFace function passes in a Color struct by value. Every time it is called, the compiler has to pack that struct into the function call. Instead, I created setColorOnFace2 that accepts the Color as a pointer. This only saved ~30-40 bytes for me, but every bit helps.

Sometimes better, sometimes worse

I have macros that give me a face relative to another face, rotated CW or CCW. Also for giving the opposite face. I have tried several different algorithms. Some perform better in certain situations, but worse in others. I haven’t come across a perfect solution.

#define CCW_FROM_FACE(f, amt) (((f) - (amt)) + (((f) >= (amt)) ? 0 : 6))

#define CW_FROM_FACE(f, amt) (((f) + (amt)) % FACE_COUNT)

#define OPPOSITE_FACE(f) (((f) < 3) ? ((f) + 3) : ((f) - 3))

It is these three that I would be tempted to dive into inline assembly to try to get something super compact.

Freddicus · September 4, 2020, 4:52am

Thanks for the writeup! I was looking at this:

I don’t know about how the compiler handles ternaries versus static arrays / array access with regards to memory efficiency, but for opposite face I just have this:

const static byte oppositeFaces[] = {3, 4, 5, 0, 1, 2};

So the face opposite 0 is 3, 1 is 4, and so on. Usage would just be:

byte rearFace = oppositeFaces[headFace];

Thoughts?

BGA · September 4, 2020, 4:57am

This is a great approach specially if you have the memory to spare. My position was always that memory (even the 300 bytes or so that Blinks have free) is easy to manage. A lot more than storage.

That being said, I did not look but you should double check what it compiles to. You might be surprised.

Freddicus · September 4, 2020, 5:12am

I have 7 calls (so far) to determine the opposite face. I don’t know what the correct way to compare usage is, so I’ll just replace one with the other and paste what the compiler spits out.

Approach #1 (array access, ternary not defined):

Sketch uses 4956 bytes (84%) of program storage space. Maximum is 5888 bytes.
Global variables use 733 bytes (71%) of dynamic memory, leaving 291 bytes for local variables. Maximum is 1024 bytes.

Approach #2 (ternary, array not defined):

Sketch uses 4972 bytes (84%) of program storage space. Maximum is 5888 bytes.
Global variables use 733 bytes (71%) of dynamic memory, leaving 291 bytes for local variables. Maximum is 1024 bytes.

I’m not sure this proves anything without more testing / data, which tracks to your section heading above “Sometimes better, sometimes worse”

ScarPixel · September 4, 2020, 4:26pm

I tried that data-centric method for the CW and CCW algorithms a while back and it wasn’t any better. This confused me at the time. Maybe I need to revisit.

BGA · September 4, 2020, 4:48pm

The biggest advantage of a platform like Blinks concerning development is that the code we create tends to be relatively straight-forward so looking at the instructions the compiler spits out is “easy” (for some definition of easy). That is a powerful tool.

ScarPixel · September 6, 2020, 1:08am

I finally figured out how to view the assembly output from a compiled sketch. I don’t see a topic about this so I’ll write one up soon.

Unfortunately, because the sketch is also compiled with -Os (to optimize for space) the resulting assembly code isn’t directly linear relative to the C source and thus pretty hard to follow.

However, I did experiment with @Freddicus’s suggestion to use the data-oriented approach for the CW, CCW, and OPPOSITE macros. So far they seem better! Saved about 80 bytes of code space, which is awesome. I haven’t actually run it to see if I messed anything up, but here’s what I have anyway:

// WARNING UNTESTED!
byte faceOffsetArray = { 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5 };
#define CW_FROM_FACE(f, amt) faceOffsetArray[(f) + (amt)]
#define CCW_FROM_FACE(f, amt) faceOffsetArray[6 + (f) - (amt)]
#define OPPOSITE_FACE(f) CW_FROM_FACE((f), 3)

This works as long as f is 0-5 and amt is 0-6, which should be fine for most uses.

(Gotta make sure I can shave 12 bytes somewhere to get this to fit )

ScarPixel · September 8, 2020, 4:04am

Something else I just learned. If you use bit fields to compact structures, try to ensure no fields cross a byte boundary. That will generate more complicated code as the compiler unpacks it.

I added a dummy bit to my plant node data structure so that exitFace would be byte aligned and it shaved a dozen-ish bytes off my code usage. The savings is more dramatic the more often the field is used.

struct PlantStateNode
{
  // Energy and state progression
  byte growState1         : 2;  // state offset when growing (1)
  byte growState2         : 2;  // state offset when growing (2) [same, but also added to growState1]
  
  // Flags
  byte isWitherState      : 1;  // state plant will jump back to when withering
  byte waitForGrowth      : 1;  // flag to only proceed once a child branch has grown
  byte advanceStage       : 1;  // plant advances to next stage in child
  byte UNUSED             : 1;  // BUFFER TO BYTE ALIGN THE NEXT FIELD

  // Growth into neighbors
  PlantExitFace exitFace  : 2;  // none, one across, fork

  // Render info
  byte faceRenderIndex    : 5;
};

ScarPixel · September 9, 2020, 5:31pm

Just checked in an updated version with a fourth plant type. I moved it to a branch so I can continue mainline dev work on master without affecting it.

This is the branch:

You’ll need to install @BGA’s custom blinklib to compile it. Follow the instructions in his thread. Be sure to use the latest and not the one given in the initial post.

My planned follow-on/sequel was going to be “Terrarium: Fauna” and remove most of the plants to give me code space to add new types of critters.

After switching to @BGA’s custom blinklib I now have enough code space to try to fit both Flora and Fauna in one sketch. Let’s see how this goes…

BGA · September 9, 2020, 6:05pm

Glad I was able to help. I noticed that you are not using datagrams. I have a change that allows you to fully disable datagrams and it gives you another 100 bytes or so. I also was able to shave another 50 bytes or so in my current development version (although I am pretty sure I am very close to the limit of what I can get without removing features. Maybe I’ve got to this limit already). I will try to push an update later today.

Othar than that, I was also able to shave another 150 bytes or so in Hexxagon itself so you might want to check my commits there to see what I did. It might also help your game.

Dudepants · September 10, 2020, 10:12pm

just shaved off 100 bytes. I was using % a lot…from within the setColorOnFace method, so those lines were crazy expensive.

ScarPixel · September 14, 2020, 4:54pm

I posted about an odd behavior with an optimization, but deleted it because it turned out to be wrong.

Basically I was trying to decrement a value pointed to by a (byte*) like this:

*value--;

But that actually decrements the pointer itself, not the value it is pointing to.

BGA · September 14, 2020, 5:07pm

ScarPixel:

I posted about an odd behavior with an optimization, but deleted it because it turned out to be wrong.

Basically I was trying to decrement a value pointed to by a (byte*) like this:

Copy to clipboard
*value--;
But that actually decrements the pointer itself, not the value it is pointing to.

(*value)++ should do the trick. You hit and operator precedence issue.

ScarPixel · September 16, 2020, 3:25am

Just had weird behavior while optimizing. I have a variable that gets set to a magic number of 7 within a function. This value is a #define so the compiler knows exactly what it is at compile time.

Changing that constant to another value, like 6 or 8, reduced the compiled size by 30 bytes

Every other value I tried from 0-255 is fine. It’s only 7 that adds 30.

Also, I have other functions that set other variables to that same #define. They have a similar bloat, but on a smaller scale. Making the value 7 only adds 2-4 bytes everywhere else. It’s just that one function that adds 30.

BGA · September 16, 2020, 4:45am

Oh, I see this kind of stuff all the time now. There are several comments on my code explaining why some things are the way they are when, logically, they did not have to be. That being said, checking the generated code might shed some light.