Tuesday, May 12, 2020

Ring Buffers


In my earlier post about redirecting printf() to a UART, I mentioned that the implementation was 'blocking'. This means that the calling code has to wait for all the data to be printed out the serial port before it can continue. If, for example, you are sending out a modest string of 20 characters at 115,200 baud (10 bits per character once you include the start and stop bits), your code will be sitting idle for around 1.7ms. At a 100MHz clock speed, that translates to around 173,000 (single-cycle) instructions that could have done something productive in the meantime.
What you really want is to hand the character processing off to a background routine so your mainline code can execute as quickly as possible. One very effective way to do this is to implement a ring buffer.
Imagine an array of 16 characters and two indexes:

#define TX_BUFF_SZ 16
char TxBuffer[TX_BUFF_SZ];
int InsertIDX = 0;
int ExtractIDX = 0;

When we want to write data into the buffer, we use the InsertIDX to write into the next available slot, and then increment the index:

TxBuffer[InsertIDX++] = write_ch;

And when InsertIDX gets to the end of the buffer, we wrap it back to zero as follows:

if (InsertIDX >= TX_BUFF_SZ)
   InsertIDX = 0;

This code works in the generic case, but if you initially set your buffer size to a power of 2, you can get a little fancy by logically AND'ing InsertIDX with (TX_BUFF_SZ - 1) to mask off the high-order bits:

InsertIDX &= (TX_BUFF_SZ - 1);
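The mask trick silently breaks if the buffer size isn't a power of two, so it's worth guarding against. Here's a minimal sketch (the static_assert and the next_index() helper are my own additions, not part of the original code):

```c
#include <assert.h>   /* provides static_assert (C11) */

#define TX_BUFF_SZ 16

/* A power of two has exactly one bit set, so (n & (n - 1)) == 0.
   This catches a bad buffer size at compile time rather than as a
   mysteriously corrupted buffer at run time. */
static_assert(TX_BUFF_SZ > 0 && (TX_BUFF_SZ & (TX_BUFF_SZ - 1)) == 0,
              "TX_BUFF_SZ must be a power of 2 for index masking to work");

/* The masked increment is equivalent to the compare-and-wrap version. */
static int next_index(int idx)
{
    return (idx + 1) & (TX_BUFF_SZ - 1);   /* 15 wraps back to 0 */
}
```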

Putting it all together, we can create a generic ring-buffer write function that also enables a transmitter empty interrupt as follows:

void rb_putchar(char ch)
{
   TxBuffer[InsertIDX++] = ch;
   InsertIDX &= (TX_BUFF_SZ - 1);
   TxEmptyInterruptEnable = 1;    //Enable the TxEmpty interrupt
}

So long as TxBuffer is at least as large as the largest amount of data we might want to send out in a single printf() call, the above code will allow us to buffer the data and return control back to the calling routine as quickly as possible - i.e. it is no longer 'blocking'.
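To actually route printf() through the buffer, the _write() hook (described in the printf()-on-a-UART post below) can push each character into the ring instead of blocking on the UART. A sketch, with a cut-down rb_putchar() standing in for the interrupt-enabled version above:

```c
#define TX_BUFF_SZ 16
static char TxBuffer[TX_BUFF_SZ];
static volatile int InsertIDX = 0;

/* Cut-down stand-in for rb_putchar() above (no interrupt hardware here). */
static void rb_putchar(char ch)
{
    TxBuffer[InsertIDX++] = ch;
    InsertIDX &= (TX_BUFF_SZ - 1);
}

/* Overriding the weak _write() makes printf() queue its characters and
   return immediately, instead of waiting for the UART to finish. */
int _write(int file, char *ptr, int len)
{
    for (int i = 0; i < len; i++)
        rb_putchar(ptr[i]);
    return len;
}
```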

Now let's deal with feeding each character one-by-one to the USART. This would be done inside a TxEmpty interrupt service routine (ISR). Observe that when ExtractIDX catches up to InsertIDX there is no more data in the ring buffer to send out, so we need to disable further TxEmpty interrupts:

void ISR_TxEmpty(void)
{
   UartTX = TxBuffer[ExtractIDX++];
   ExtractIDX &= (TX_BUFF_SZ - 1);
   if (ExtractIDX == InsertIDX)
      TxEmptyInterruptEnable = 0; //Disable further TX empty interrupts
}

Caveats

To help keep the focus on the core concepts, I've kept the code above very simple, but in practice, you'll need to qualify your declarations of InsertIDX and ExtractIDX as volatile so that the compiler recognizes these variables can be updated outside of the normal function execution path. You should also guard against rb_putchar() overrunning the buffer when data is written faster than the UART can drain it.
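Putting those caveats together, a fuller sketch might look like this: volatile on both indexes, plus a full-buffer check that deliberately sacrifices one slot so 'full' and 'empty' can be told apart (TxEmptyInterruptEnable is a stand-in for the real interrupt-enable bit):

```c
#define TX_BUFF_SZ 16
static char TxBuffer[TX_BUFF_SZ];
static volatile int InsertIDX = 0;   /* written by main code, read by the ISR */
static volatile int ExtractIDX = 0;  /* written by the ISR, read by main code */
static volatile int TxEmptyInterruptEnable = 0; /* stand-in for the real bit */

/* Returns 0 on success, -1 if the buffer is full.  The caller can then
   decide whether to drop the character, retry, or block. */
int rb_putchar(char ch)
{
    int next = (InsertIDX + 1) & (TX_BUFF_SZ - 1);
    if (next == ExtractIDX)
        return -1;                 /* full: one slot deliberately unused */
    TxBuffer[InsertIDX] = ch;
    InsertIDX = next;
    TxEmptyInterruptEnable = 1;    /* kick off the TxEmpty interrupt */
    return 0;
}
```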

Monday, May 11, 2020

stdio - to buffer or not to buffer?


In my previous post, I showed how stdio could be redirected to send printf() output to an available USART. It was a basic implementation designed to get things moving, but you may encounter a little quirk when sending strings that don't end with a newline '\n' character.
By default, stdio buffers (holds onto) outgoing data until it sees a newline character, at which point it sends the entire line to the serial port. The benefit of this is that it reduces the amount of low-level interaction you need to have with the USART driver code, but the disadvantage is that you have to ensure each printf() call ends with a newline if you want the string to be immediately presented to the COM port.
I'll leave it up to you as to which implementation is best for your situation, but in case you decide to eliminate the input and output buffering, here's how to do it in C.
Somewhere in your initialization code, add the following lines:
setbuf(stdout, NULL);
setbuf(stderr, NULL);
setvbuf(stdin, NULL, _IONBF, 0);
The first line will remove buffering from stdout (which is where printf() output goes by default).
The second line will remove buffering from stderr (if you are using that for directing error information).
The last line will remove buffering from stdin, which is where scanf() looks for input by default.
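A middle-ground alternative, if you'd rather keep the default buffering, is to flush on demand: fflush() pushes stdout's buffered characters out immediately. A small illustrative helper (show_progress() is just a made-up example, not from the post above):

```c
#include <stdio.h>

/* Print a partial line (no newline) and push it out immediately.
   An alternative to disabling buffering globally with setbuf(). */
void show_progress(int pct)
{
    printf("Progress: %d%%\r", pct);
    fflush(stdout);   /* flush now instead of waiting for a newline */
}
```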

The juicy details can be found here:
https://www.gnu.org/software/libc/manual/html_node/Controlling-Buffering.html


Tuesday, May 5, 2020

Implementing printf() on a UART

When developing on a new embedded (bare metal) platform, one of the first things I look for (after getting an LED to flash) is the ability to output text to a display. Usually the simplest path is to commandeer one of the UARTs and to start bashing characters out the TX pin and receiving them at the host end through a PuTTY terminal. So how to implement this efficiently?

The first question you'll need to resolve is how to get characters from your code into the UART. Hopefully the framework you are using provides a simple setup process and a wrapper to allow you to initialize the UART and send a single character. In the STM32CubeIDE environment, the UART properties can be set up fairly simply:

115200 is a pretty safe baud rate to start with. I've shown the corresponding PuTTY settings too. Once you've proved that basic communications are working, you will probably be able to set the baud rate much higher, although take care when doing this. Even though the STM32 might be able to do 8Mbaud, FTDI-based USB serial devices will top out at 3Mbaud: (https://www.ftdichip.com/Support/Knowledgebase/index.html?whatbaudratesareachieveabl.htm)
In any case, let's assume you've correctly set up the comms basics (don't forget to make sure the parity and stop bits match). The next job is to send something simple.

Once again, using the STM32CubeIDE environment as an example, ST offers a simple function call as part of the HAL (Hardware Abstraction Layer) interface:
HAL_StatusTypeDef HAL_UART_Transmit(UART_HandleTypeDef *huart, uint8_t *pData, uint16_t Size, uint32_t Timeout)
Assuming STM32CubeIDE has instantiated the UART handle as "huart1", you can make a simple call from your code like this:
char msg[] = "Hello World";
HAL_StatusTypeDef hres;
hres = HAL_UART_Transmit(&huart1, (uint8_t*)msg, (uint16_t)strlen(msg), 100);

Tying into printf()

Sending out "Hello World" might be initially very gratifying, but eventually you'll need to send out more advanced information such as variable values. One tempting option is to jump in and write your own MyPrintf() function, but that would involve a whole lot of re-inventing the wheel. A much better option is to tie into functionality that already exists. Fortunately, the authors of most embedded frameworks understand that developers will want to redirect their output through various peripherals, so they provide the infrastructure to readily support it.
If you dig deep enough, you'll find that after several layers of code, the standard printf() function distills down to one or two low-level functions. Take a look in the syscalls.c file for the _write() function:
__attribute__((weak)) int _write(int file, char *ptr, int len)
{
   int DataIdx;
   for (DataIdx = 0; DataIdx < len; DataIdx++)
   {
      __io_putchar(*ptr++);
   }
   return len;
}
Take note that this function is prefaced by __attribute__((weak)). This tells the linker to prefer any other _write() implementation it finds in your code, and to fall back to this one only if no other exists. In effect, it allows you to implement your own _write() function within your codebase without needing to modify the library source files. A UART-based implementation would therefore be:
int _write(int file, char *ptr, int len)
{
   HAL_StatusTypeDef hres;
   hres = HAL_UART_Transmit(&huart1, (uint8_t*)ptr, (uint16_t)len, TX_BLOCKING_TIMEOUT);
   if (hres == HAL_OK)
      return len;
   else
      return -1;
}
With this function now in your code, you can access all of the wonderful formatting capabilities that come with printf() and your output will be magically directed to the serial port.

Concluding Remarks

The implementation above gets you up and running with basic character output capabilities but there's a hidden problem you may face further down the track. This implementation is 'blocking'; which means that it waits until all characters have been successfully piped out the serial port before it returns control back to the calling function. In timing critical applications or when sending large volumes of data, this could have an impact on the performance of your system, so in another blog entry, I'll talk about how to implement a ring-buffer and interrupts so your code doesn't have to wait around for the UART to finish its thing.


Wednesday, April 29, 2020

fopen() - Text or Binary?

It's often convenient to control an embedded system remotely via some sort of communications protocol. In the good ole days, I'd devise some sort of homebrew communications protocol that was specific to my application's needs and I'd bash some data in and out of the serial port. Having grown up on 8-bit micros with limited RAM, every byte was valuable so there was a strong inclination to pack things tightly into binary streams, but the problem with that is how to test it... and before you know it I'm being dragged into an endless rabbit hole of development on the PC side just so I could conveniently communicate with the embedded app.
Embedded processors have come a long way and right now, I'm getting up close and personal with some pretty nifty STM32 processor boards - this one in particular that I sourced from Banggood: https://sea.banggood.com/STM32F407VET6-Development-Board-Cortex-M4-STM32-Small-System-ARM-Learning-Core-Module-p-1460490.html

Compared with an 8051, these ARM devices are beasts; and the development boards are so cheap! It's time to cast off the constraints of decades past and update my comms routines... enter JSON.
JSON is sooo simple even I can wrap my head around it; it's text based which means testing it with my embedded app is simply a case of sending text files through a PuTTY terminal. Furthermore, because JSON is used so widely across the winternet, I'm pretty confident I'll be able to keep using it even if I upgrade my comms to Etherweb.
I dug around and quickly settled on a relatively compact JSON interpreter (called JSMN) from Serge Zaitsev (kudos @zsergo): https://zserge.com/jsmn/. It comes packaged as a single C header file which feels a little hacky, but it's nonetheless effective and I was able to get a basic command interpreter up and running on my target board pretty quickly.
At that point, I started thinking more about what sort of commands I might want to exchange and it became pretty clear that I was going to need some sort of generic command interpreter that was easy to abstract and extend.  That's going to be a work in progress but the relevance to today's discussion is how it drove me back to establishing a parallel development platform that would allow me to develop and test code on my PC rather than needing to run everything from my target (ARM) platform, and how I bumped into a nasty little PITA on how Windows processes text files.
I'm not going to post my original code because some doofus will blindly cut and paste it into their application and wonder why it doesn't work... instead I'll post the working code:

/*
 ============================================================================
 Name        : main.c
 Author      : Marty Hauff
 Version     :
 Copyright   : Use at own risk
 Description : Test scaffold for reading a file and processing...
 ============================================================================
 */

#include <stdio.h>
#include <stdlib.h>
#include <limits.h> //PATH_MAX
#include <unistd.h> //getcwd()
#include <sys/stat.h> //struct stat
#include <string.h> //strlen()

#include "jsmn.h" //JSMN single-header parser

#define MAX_TOKENS 256

int main(void) {
   FILE* fp;
   char* Filename;
   struct stat st;
   char cwd[PATH_MAX];
   char* json_string = NULL;

   printf ("\nJSON Test\n");
   if (getcwd(cwd, sizeof(cwd)) != NULL) {
      printf ("Current working dir: %s", cwd);
   } else {
      perror("getcwd() error");
      return EXIT_FAILURE;
   }

   Filename = "JSON Examples/JSON_2.json";
   fp = fopen(Filename, "rb"); //MUST use binary mode otherwise \r\n sequences get messed up!!
   if (fp == NULL)
   {
      printf ("\nFailed to open %s", Filename);
      return EXIT_FAILURE;
   }

   printf ("\n\"%s\" opened successfully", Filename);
   stat(Filename, &st);
   printf ("\nStat Size: %ld", (long)st.st_size);
   fseek(fp, 0, SEEK_END);
   printf ("\nfseek Size: %ld", ftell(fp));
   fseek(fp, 0, SEEK_SET);

   json_string = malloc(st.st_size + 1);
   fread(json_string, st.st_size, 1, fp);
   fclose(fp);

   json_string[st.st_size] = '\0';
   printf ("\nstrlen Size: %zu", strlen(json_string));
   printf ("\n%s", json_string);

   jsmn_parser p;
   jsmntok_t tkns[MAX_TOKENS];
   int nodes = 0;

   jsmn_init(&p);
   nodes = jsmn_parse(&p, json_string, strlen(json_string), tkns, MAX_TOKENS);
   printf ("\nFound %d nodes", nodes);

   free(json_string);
   return EXIT_SUCCESS;
}

The thing that got me stuck for a day was that pesky little 'b' character in the fopen command. Here's what the output looked like without it (I'm using some test JSON from: https://json.org/example.html):

JSON Test
Current working dir: C:\Users\user\Documents\Projects\WinTest
"JSON Examples/JSON_2.json" opened successfully
Stat Size: 253
fseek Size: 253
strlen Size: 253
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
âãäåæçèéêëì
Found -2 nodes

I couldn't work out what was causing the garbage at the end of the stream (second last line) and why jsmn was subsequently failing to parse the file. The clue is that there are the same number of garbage characters as there are lines in the JSON string. Take a look at a hex dump of the JSON file:
Take note of the 0D 0A sequence. Also take a look at these settings in Notepad++

Without the 'b' in the fopen() call, the file is opened as a text file and Windows strips out all the CR (0x0D '\r') characters when it's reading the contents into a buffer. As a result, the string stored in memory is actually shorter than the number of characters read from disk.
Now look at the output when the 'b' is included in the fopen() call (i.e. as per the code listing above):
JSON Test
Current working dir: C:\Users\user\Documents\Projects\WinTest
"JSON Examples/JSON_2.json" opened successfully
Stat Size: 253
fseek Size: 253
strlen Size: 253
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
Found 26 nodes

The CR character remains in the text stream so the console output (on Eclipse) double spaces each line, but take note that we no longer get the garbage at the end of the stream; AND jsmn has managed to parse the file correctly.

The moral of the story: consider reading a file in binary mode even if you know the contents are text.
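And if you later decide you don't want those '\r' characters in memory either, they can be stripped in place after the binary-mode read. A small sketch (strip_cr() is my own helper, not part of the listing above):

```c
/* Remove every '\r' from a NUL-terminated string, in place.
   Handy after reading a Windows text file in binary mode. */
void strip_cr(char *s)
{
    char *dst = s;
    for (char *src = s; *src != '\0'; src++) {
        if (*src != '\r')
            *dst++ = *src;   /* keep everything except CR */
    }
    *dst = '\0';
}
```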

Now to work out how to build a generic JSON command interpreter.

Return from hiatus

After almost 10 years since my last blog, I've decided it's about time I started writing up a few things again. It's been an eventful 10 years... not the least of which was living in China for almost 8 of them. The great firewall of China blocks this blogger site so it was increasingly difficult to make entries and, for the most part, I was so busy doing life that finding time to update this blog just became one of those things that fell off the priority list.
In any case, kids are now living their own independent lives back in Australia and a year ago wifey and I moved to Singapore and I've got a lot more control over my time.  And with this Covid-19 shenanigans I've been recently put out of (working for the man) work and so I've got plenty of time to pursue some projects of my own. Over the past couple of years, I've also bought some reasonably good test equipment (DSO, PSup, Sig Gen & Desktop Multimeter) so I'm pretty well equipped to do development at home. One of these days I'll do a writeup of my lab setup.
For now, this is a quick hello to the world and I hope to be keeping this site much better fed.

Friday, April 15, 2011

Using Competitors as Suppliers

Not that long ago, a PCB designer posted an interesting question on one of the LinkedIn discussion groups. The question was this:

"How do you feel about using a fabricator that also offers design services?"

I responded to the discussion directly but for the benefit of those who may not be members of the group, I wanted to share my thoughts more broadly because it shows how building connected devices can actually help defend against suppliers who might also be competitors.

It's a really interesting question that you are posing but I think it points to a bigger issue.
I use a mechanic in my local area to service my car. The service he offers (in terms of tuning my car up and changing the oil) can not be differentiated from any other mechanic I might choose to use, but over time, I have become loyal to this one mechanic because the service he offers goes beyond the oil he puts in my car. Sitting at the back of our transactions is a relationship and the history of that relationship is not something that a competitor could replace by simply undercutting the price.
So when it comes to creating an electronics product, I think designers really need to ask themselves whether the sole basis of their business is built on the secret sauce they add to their boards in the form of IP, or does it extend to an ongoing relationship.
I know the Apple example is overused but it really highlights the point. They don't make the cheapest MP3 players, but they have changed the game so that it's actually not an MP3 player that I'm buying from Apple; its a relationship and ability to seamlessly connect into their eco-system of content. In effect, the player is almost ancillary to my real requirement which is to have good quality music / entertainment on the run.
So the best defence against would-be IP thieves is to build products that develop an ongoing relationship and provide a reason for customers to remain connected to you... and then service those customers as if your life depended on it (because it does)!

Wednesday, April 13, 2011

China, here we come

It seems ironic that my last post dealt with the issues that foreigners have in grasping command of the English language. And now, it seems, I am to become a foreigner too.
I haven't been on the air much over the past 6 months because I've been backwards and forwards to China several times while working pretty solidly on the launch of a brand new training facility at our office in Shanghai. I'm thrilled with what has come together around that project and the fact that this week marks the launch of our first public course.
While "paid for" training might be a relatively new thing in China, the appetite is very high for quality courses that lift designers to a new level of skill and provide them with a deeper understanding of the methodology driving the creation of Altium's solutions. Our new facility will allow us to help customers better than ever before while also feeding the huge demand for skilled electronics designers as China continues on its massive growth surge.

In other news, you no doubt would have read the news that Altium is relocating its headquarters to Shanghai, China. This has been met with a huge range of responses from customers and industry commentators. Some have suggested that it makes perfect logical sense given the huge investment being made in China in technology areas that are in absolute alignment with Altium's vision to create a "copper to cloud" design tool. But others have been slightly less rational. At the extreme end they've insinuated that Altium's move can only mean that we have lost our way and have been hijacked by Communist antagonists who are planning on overthrowing Western military installations by using Altium Designer as a back door window into the inner sanctums of top-secret design houses.
In all honesty, I've got little time for xenophobic tirades but I do understand the depth of emotion that news such as this can evoke. It is very hard for designers in the West to not feel threatened by Altium's decision. Western designers have been the fortunate beneficiaries of over a century of manufacturing-fueled growth that has led to great relative prosperity. But Altium's decision to locate its headquarters in Shanghai rather than Silicon Valley makes a very strong statement about where it sees the next wave of prosperity coming from. And that statement challenges several assumptions that many in the West have become accustomed to making. But on reflection, do we really believe that the West has some sort of monopoly on innovative design and quality? Do we really believe that our political system somehow gives us the absolute right to create better products?
Now before anyone starts sending volleys of political abuse at me, please take a moment to consider what I'm saying. The way I see it is this: when 1.4 billion people start becoming upwardly mobile, you can choose to stand at the shore and yell at the encroaching tide. Or you can jump right in and ride the wave of your life.
I, for one, am a surfer and my family and I are currently preparing for a move to China.

Tuesday, September 21, 2010

Thank goodness English is hard to master

I took a moment the other day to clean out my email spam folder and was amazed again at some of the crap that is out there and circulating. Frankly I pity the mums and dads, grandmas and grandpas who are still getting a handle on internet communications and are forced to wade through the mountains of spam content that spews from lowlife individuals trying to gain a dishonest buck.
No, I don't want $US1,000,000 deposited into my bank account from the estate of some poor soul whose only crime was to die in an African country that I've never heard of before and whose name I can't pronounce. I don't want to increase the size of my ... (and frankly I'm offended at the suggestion that I need any further augmentation), and I'm not in any way your acquaintance so don't address me as "Dear Friend".
But as good as my spam filter is, and as thankful as I am for it performing the job of raking through the rubbish that skates across the internet, I'm actually most thankful that English is a difficult language to master. It is that fact that provides us with the single biggest identifier of spam emails and allows us to differentiate them from their legitimate counterparts. Take this latest email as an example:

Dear Friend
I am Mr Ailudiko Razak working with Islamic Development Bank(ISDB)Ouagadougou Burkina Faso. I want to inquire from you if you can handle this transaction for mutual benefits/life opportunity for you and me.The transaction is about seeking your consent to present you as the next of kin/ beneficiary To our late customer over his fund US$25,Million dollars.
He died with his family during their vacation journey. I am waiting for your response for more details. The fund is going to be share at the ratio of 60/30.30% for you and 60% for i and my family which we are going to use for investment.and 10% for outstanding expenses.
Mr Ailudiko Razak


What self-respecting bank would ever communicate using such a poor command of the English language? Even if my spam filter had allowed this one to slip through the cracks, I'd have every opportunity to detect its stench simply from the malformed sentence structures and incorrect use of words.
So while the English language is the bastard child of centuries of conquerors arriving on the shores of the UK, it is now the greatest asset I have to protect me against cyber criminals.

Thursday, September 16, 2010

Altium Morfik acquisition

At last I can finally share with you some commentary about what has been rattling around the corridors here at Altium HQ for some time. We've just formally announced our intention to acquire Morfik and it's very exciting.

Picture this. You're an electronics designer and you've got a great idea for a new gadget. You've got all the skills necessary to design a PCB with some smarts in it and you can even program those smarts yourself. But then you come to adding some connectivity to the internet. You add an ethernet interface, update your smarts, and now you're ready to do the cloud stuff .... and you hit a brick wall. All that stuff about PHP and SQL and internet servers and SOAP and HTML and XML and Java and Ajax and ... It's a whole 'nother world!

So what do you do? As an electronics designer, you know about bits and bytes and if you were ever pushed on the point, you could probably even design the hardware for a web server. But when it comes to writing applications that exist in the cloud, where do you start?

Put simply, the Altium Morfik acquisition is all about giving you that starting point right out of the box. The philosophy is that pretty soon, every little device will need to be somehow connected to the cloud to maintain its relevance and appeal. And when it comes to designing those little neddies, you've got to start thinking about how and what you're going to connect it to. How will you pass data between your device and the cloud? Will it be via email posts, a simple Web server running inside the device, or will it be some other technique?

The cool thing about what Altium is up to is that pretty soon you won't need to worry about the implementation specifics of all that sort of stuff. Using Morfik's technology (which lets you write applications on a PC and deploy them into the cloud), and Altium's unified design strategy, you'll be able to co-develop new devices AND the cloud-based eco-systems that they plug into. So adding cloud connectivity and applications will be just as accessible to you as an electronics designer as it is to all those geeky CS dudes ;)

Hopefully it won't be long before I'll get to show you how this stuff works in practice with some real demos, but for the time being, I suggest you take a look at the videos on Morfik's website. We'll be adding more and more of this stuff under the Altium banner over time but take it from me, this is a state changer.

If you thought that Altium was a little out there as an EDA company, now we're off in the cloud!

Thursday, August 26, 2010

Who cares about inefficient code

At the risk of stirring up a hornet's nest, I'm getting really tired of the naysayers who quickly play the "it's not as efficient as hand-crafted code" card when a new, high-level programming or design technique comes along. They just don't seem to get that the question of 'efficiency' extends well beyond the run time of the application. In the real world of commercial pressures, making successful products is not just about the performance of the end product. It's also about your ability to develop, deploy and maintain that product within the window of opportunity given to you by the market.
This little rant of mine was provoked again after I read a recent FPGA Journal article 'Drag and Drop vs HDL?' by Dick Selwood. It was a well-written and informative article about National Instruments' continued push into FPGA design with their drag and drop design environment. But all three reader comments (as of today) focus on the efficiency of the code produced by GUI-based design approaches. Now come on guys. Surely you can try a bit harder. Of course hand-crafted code is going to be more efficient than GUI-based stuff. But that only matters when it matters! To write off the whole GUI-based approach on account of the few situations when it isn't suitable is way too short-sighted. And it's not like NI is taking away the ability to use hand-crafted code. They are simply giving you the choice.
The real problem with the "it's not as efficient as hand-crafted code" argument is that although it sounds rational, it places us in a very dangerous position of being dismissive of the whole technique without giving further thought to whether that technique will be disruptive. If you ever get the chance to read "The Innovator's Dilemma" by Clayton M. Christensen then I highly recommend you do. It gives some really strong reasons why it can be suicidal to ignore new technologies on the basis of how well they fit current market demands. Technology doesn't stand still. It continues to push on at a breakneck pace. If we dismiss a technology today because it appears to be inefficient compared with established techniques, we run the risk of being blind-sided when technological advances suddenly make the inefficiencies irrelevant. By that time, it is too late to reposition ourselves around the new way of doing things. As the book puts it, the real question is not about efficiency, it is about how disruptive the 'new thing' will be.
So here are my rules:
1) If something looks slow, technology will make it fast.
2) If a new design technique raises the abstraction level, gets you to market faster, or allows broader access to growth technologies (i.e. disruptive), it will supplant other techniques.
3) There will always be a need for hand-crafted solutions. But the proportion of products that must be hand-crafted will decrease.

Monday, August 23, 2010

The difference between Vias and Pads

A recent post on one of Altium's forums related to the fundamental difference between pads and vias. Because I thought it was an interesting question, I figured it was worth posting a blog entry about it.

Pads are the connection point between copper on the PCB and leads on the component. They are a common part of component footprints and they can be through hole or surface mount. When they are through hole, they are virtually always drilled completely through the board. When they are surface mount, they only exist on either the top or bottom layer. Pads can have virtually any shape; however, round, rectangular, and rounded-rectangular are the most common.

Vias, on the other hand, are the means by which an electrical connection is routed between two layers. As such, they are part of the track net and not usually tied to the component footprint. They are always drilled and always round. The depth of a via can vary depending on whether it needs to pass between the outside layers of the PCB, an outside and inside layer (blind), or two inside layers (buried).

So the big question is, "Are these two primitives similar enough to be merged as one?"
The short answer is No. And the reasons are multiple:
1) Pads make their connection to component leads by being soldered. This means that the properties of the pad must have consideration for the soldering process being used. Solder mask pullback, pad surfacing, size and shape are all dictated by the soldering process and the physical properties of the component lead being connected.
2) Pads need to support multiple sizes and shapes due to both the properties of the leads of components connected to them, and any heatsinking effects needed as part of the component's cooling.
3) The pads of low pin count surface mount components need to be thermally balanced to avoid any ill effects caused by different cooling rates. For instance, a pad on one end of a two lead component (such as a resistor) that cools faster than the pad at the other end can cause the component to stand on its head (tombstone) as the solder contracts.
4) Unless you are dealing with embedded passives (i.e. components that are placed within the layers of the PCB), it makes little sense to have buried pads. Blind pads could be argued for some leaded components but that would make the board very dependent on very accurate lead lengths being maintained by the component vendor. This is probably an unnecessary risk as component leads that are too long will cause the components to stand off from the PCB. In some instances this may be desirable but I suspect it would be more hassle than it's worth.
5) Pads must have a designation to indicate how component pins and pads must be aligned.
6) The primary conduction path of a via is through the hole barrel. The copper donut area on top and bottom of the via is simply there to provide a solid connection between the hole barrel and the top / bottom connecting tracks. Without this, the connecting track could be torn off when the via barrel is drilled during manufacture.
7) Because vias don't require solder to fulfill their purpose, vias can be tented (covered with solder mask) to ensure that no copper is exposed to the outside world. This limits the risk of oxidisation of the via copper.
8) Vias offer a path between layers and so it is meaningless to use anything other than round, donut-shaped entry/exit points.
9) Vias need not have any designation since they are unrelated to components. However some form of unique ID would be helpful when devising design rules intended only for specific vias (of specific nets).

So in summary, Vias and Pads might appear similar but their functions are quite separate. And in my view, they should remain as separate primitives. Comments?