The way I do it is as follows:
1. Tileset bitmaps can be named in any way you like.
2. They can be any size that works on most graphics cards; I usually stick with 256x(whatever).
When the engine loads a map, the map header contains a lookup table of the tilesets used in that map, each assigned a number from 1 to 127. The engine loads the required tilesets, parses out the tiles needed (based on the map), and then unloads the tilesets from memory (keeping only the tiles). This uses the least amount of video RAM possible.
Each "tile" structure contains a pointer to its sprite structure. So when rendering a map, just pass the pointer to DirectX, and it'll render.
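To make that concrete, here's a rough sketch (names, sizes, and the `tileRect` helper are my own assumptions, not from any particular engine) of locating tile N inside a 256-wide tileset, plus the tile-to-sprite pointer idea:

```cpp
#include <cassert>

// Hypothetical sketch: where tile N lives inside a 256-pixel-wide
// tileset bitmap of 32x32 tiles. The engine copies this rect out into
// its own sprite, then frees the big bitmap.
struct Rect { int x, y, w, h; };

Rect tileRect(int index, int sheetWidth = 256, int tileSize = 32) {
    int perRow = sheetWidth / tileSize;        // 8 tiles per row at 256 wide
    return { (index % perRow) * tileSize,      // column -> x offset
             (index / perRow) * tileSize,      // row    -> y offset
             tileSize, tileSize };
}

// Each map tile keeps only a pointer to its extracted sprite,
// which is what gets handed to the renderer.
struct Sprite { /* texture/surface handle would live here */ };
struct Tile   { const Sprite* sprite; };
```

With 32x32 tiles on a 256-wide sheet, tile 9 sits one row down and one column over, i.e. at (32, 32).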

Anyway, that's the storing and loading part... I guess that's all you're looking for?
I just encrypt and compress my tilesets. I can decompress and decrypt them in memory and pass a pointer to that memory address to create the surface. You could also just keep it in memory and parse out the tiles into their own little surfaces; that way, you will only have the tiles you absolutely need in memory...
If you have tons of tilesets, it is almost always a bad idea to load them all into memory (especially if you have 50+ MB of graphics). Only load the ones you need... à la Graal, Astonia, Ashen Empires, Shadowbane, and others. Not everybody has a video card with mundo memory.
Another option is to load all graphics into system memory, then push surfaces over to video memory when you need them, like during the loading of a map. Then you can cycle out tiles you don't need when a new map is loaded.
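A minimal sketch of that cycling idea (container choice and names are mine): track which tilesets are currently in video memory, evict what the new map doesn't use, and push over what's missing.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Sketch: "videoMem" stands in for the set of surfaces currently pushed
// to VRAM. On map load, evict tilesets the new map doesn't use, then add
// the rest (a real engine would copy each surface from system memory here).
struct SurfaceCache {
    std::set<std::string> videoMem;

    void loadMap(const std::set<std::string>& needed) {
        for (auto it = videoMem.begin(); it != videoMem.end(); )
            it = needed.count(*it) ? std::next(it) : videoMem.erase(it);
        videoMem.insert(needed.begin(), needed.end());
    }
};
```

Surfaces shared between the old and new map stay resident, so you only pay the system-to-video copy for tilesets the previous map didn't use.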
Btw: non-square tiles are supported in the newest NVIDIA and ATI drivers for all their cards, so you can do 3D sprites of 32x64 or whatever. But we're only talking about the basics right now, eh?