Notes on Software Development

Notes and lessons learned from working in technology, especially .NET, Azure, DevOps, Agile, and Team Foundation Server.

Wednesday, August 22, 2018

Cosmos Change Feed and Lease Collection Use

The change feed functionality in Azure Cosmos DB is a great feature with a wide variety of uses. The change feed subscribes to changes that occur in a collection and keeps its state in another collection, called the lease collection. Because the minimum throughput for a collection is 400 RUs and a single lease barely uses 50 of those, we had decided to store additional metadata documents in the same lease collection; we could not use the continuation token and sequence number data to determine where in the stream a subscriber was, so we tracked that ourselves for monitoring purposes. Don't do this! The change feed library regularly scans and pulls every document in the lease collection to balance work across the subscribers, so the more documents the collection contains, the greater the RU requirements. Our RUs were bumping against 4,000 when we finally realized what was happening:

[Image: lease collection RU consumption spiking toward 4,000]

After removing our additional metadata documents, the lease collection RUs are much better (note that we have several subscribers using the same lease collection).

[Image: lease collection RU consumption after removing the metadata documents]

Now we store our metadata documents in a separate collection, which happily sits at around 100 RUs, while the lease collection, as you can see above, hovers around the same.
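For illustration, here is a minimal sketch of that arrangement using the change feed processor in the current Azure Cosmos DB .NET SDK (v3), a newer API than the change feed library this post was written against; the database, container, processor, and connection-string names are hypothetical. The point is simply that the lease container holds nothing but the processor's own lease documents, while the metadata lives in its own collection.

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

class ChangeFeedSample
{
    static async Task Main()
    {
        var client = new CosmosClient("your-cosmos-connection-string");

        // The monitored collection and the lease collection are separate containers.
        Container monitored = client.GetContainer("your-database", "your-collection");
        Container leases = client.GetContainer("your-database", "leases");

        ChangeFeedProcessor processor = monitored
            .GetChangeFeedProcessorBuilder<dynamic>("your-processor",
                (IReadOnlyCollection<dynamic> changes, CancellationToken cancellationToken) =>
                {
                    // Handle the changed documents here.
                    foreach (var document in changes)
                    {
                        Console.WriteLine(document);
                    }
                    return Task.CompletedTask;
                })
            .WithInstanceName(Environment.MachineName)
            .WithLeaseContainer(leases) // leases only; keep metadata documents elsewhere
            .Build();

        await processor.StartAsync();
        Console.WriteLine("Listening for changes. Press any key to stop...");
        Console.ReadKey();
        await processor.StopAsync();
    }
}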

Bottom line: don't store anything custom in the lease collection of the change feed, because each additional document drives up the RU cost of the library's regular scans.

Wednesday, July 12, 2017

TFS Hosted Build Controller and .NET 4.7

If you want the TFS hosted build controller to run .NET 4.7 builds, make sure to change the default agent queue to "Hosted VS2017". You can do this by editing the build definition: on the Tasks tab, under Process, the default agent queue is the second option.


Sunday, June 4, 2017

Backups, Synology, and the Recycle Bin

I am a big fan of Synology products, owning their 5-bay DS1515+ NAS (now the 1517+) and the RT2600 router. The Synology software is fantastic and user-friendly, and it allows additional packages to be installed, further extending the product's features. Within the last year or so, they've greatly improved their cloud-syncing features, and the CloudSync package provides an easy way to sync files between the Synology and cloud providers. They also have packages for syncing to other storage options (such as Amazon's Glacier storage).

I had recently configured my NAS to back up all local files to Amazon Drive, which offers unlimited storage for $59.99 a year (plus tax). Yes, unlimited! It's a great deal. If you don't have a cloud backup location for your files, Amazon's offering is worth looking into. And they have software you can install across different platforms so you can sync local files and directories to the cloud.

I was cleaning up the folder structure today in the Synology File Station software and accidentally deleted a root folder (yikes!). I immediately caught the error and paused the sync, but it was too late; some of the files had already been deleted, both on the Synology and on Amazon. Fortunately, Amazon Drive has a "Recycle Bin," so I was able to recover the files. However, this made me enable a feature I had assumed was already turned on: Synology's own Recycle Bin. You should verify yours is turned on too.

Navigate to the Synology Control Panel, choose the Shared Folders icon, select the appropriate folder, choose Edit, and check the "Enable Recycle Bin" option. Now if you do something terrible like deleting an important folder, at least you won't have to wait for hours to pull it back down, if you are syncing to another location.

Two lessons learned:
1. Make sure you have more than one backup. Seriously, buy some space on Amazon or another cloud provider, set up a sync, and make sure it completes. It's important!
2. Make sure your folders have an "undelete" option available.

And just for grins, the files on my Synology that are long-term, unchanging backups are going to get a third backup location on Amazon's Glacier storage, so I am covered there. At $0.01/GB/month, it's a cheap option.

Friday, April 21, 2017

Dictatorial Management

Issues don't go away simply because you issue an edict or say that the issue will no longer happen.

One of my favorites is a problem my team runs into almost every day: a developer checks code into the system and breaks the deployment to our test environment. Management's "solution" is that developers shouldn't break the deployment, so there's no point in educating them on how to troubleshoot and resolve deployment issues.

Now stop and re-read that last sentence. What!?

Because an issue shouldn't occur, there's no point in educating people on how to solve the issue when it does occur. This is a typical management solution.

Another common management solution is to threaten to track the number of issues per developer and have this count reflected on their next performance review. Thus far, I've never seen this done, as to do so would be quite onerous on the manager (and in reality, what does this actually solve?).

How much more successful would companies, teams, and people be if we stopped the nonsense of impossible solutions? Simply stating "this issue will never happen again," or threatening and cajoling, does nothing to actually solve a problem. Why don't we work in realities and actual possible solutions, instead of the ridiculous power-insanity of those at the top or the simplistic manager solutions that do nothing to address the real issues?

Do you want to actually provide a real solution? How about empowering your employees with the mastery and autonomy to actually care about what they do and the quality with which they do it? What if they had some ownership in the process and the success of what they are doing? What if, instead of dictating, you stepped aside and let the team choose? It might not be good for your ego, but it sure would solve a lot more issues than a top-down approach.

Friday, November 20, 2015

Getting all changesets associated with work items

I had a need today to pull a list of all check-ins that had been associated with a certain list of work items. This can be done easily using the TFS client API assemblies (Microsoft.TeamFoundation.Client, Microsoft.TeamFoundation.WorkItemTracking.Client, and Microsoft.TeamFoundation.VersionControl.Client), most of which are located in C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\ReferenceAssemblies\v2.0. The code below writes the list of all file changes across all changesets for the specified work item IDs.
// Required namespaces: System, System.Collections.Generic, System.IO, System.Linq,
// Microsoft.TeamFoundation.Client, Microsoft.TeamFoundation.VersionControl.Client,
// and Microsoft.TeamFoundation.WorkItemTracking.Client.

// Redirect console output to a file while the changesets are written out,
// then restore it afterwards so the exit prompt appears on screen.
TextWriter originalOut = Console.Out;
using (var fs = new FileStream("Test.txt", FileMode.Create))
using (var sw = new StreamWriter(fs))
{
    Console.SetOut(sw);

    var workItemIds = new int[] { 1, 2, 3 };
    var collectionUri = new Uri("http://yourtfs/server/");

    try
    {
        using (var tpc = new TfsTeamProjectCollection(collectionUri))
        {
            var workItemStore = tpc.GetService<WorkItemStore>();
            var versionControlServer = tpc.GetService<VersionControlServer>();
            var artifactProvider = versionControlServer.ArtifactProvider;

            // Pull only the work items of interest.
            var workItems = workItemStore.Query(workItemIds, "Select [System.Id], [System.Title] from WorkItems");
            var allChangesets = new List<Changeset>();

            // Each associated changeset appears as an external link on the work item.
            foreach (WorkItem workItem in workItems)
            {
                allChangesets.AddRange(
                    workItem.Links.OfType<ExternalLink>()
                        .Select(link => artifactProvider.GetChangeset(new Uri(link.LinkedArtifactUri))));
            }

            // Newest changesets first, listing every file change beneath each one.
            var orderedChangesets = allChangesets.OrderByDescending(c => c.CreationDate).ToArray();
            foreach (var changeset in orderedChangesets)
            {
                Console.WriteLine("{0} on {1:MM-dd-yyyy HH:mm} by {2} ({3} change(s))",
                    changeset.ChangesetId, changeset.CreationDate, changeset.Owner, changeset.Changes.Length);
                foreach (var change in changeset.Changes)
                {
                    Console.WriteLine("    [{0}] {1}", change.ChangeType, change.Item.ServerItem);
                }
                Console.WriteLine("-----");
                Console.WriteLine();
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
}

Console.SetOut(originalOut);
Console.WriteLine("Press any key to exit...");
Console.ReadKey();

Wednesday, November 11, 2015

Branching

I had a conversation with a coworker yesterday about when to create branches in TFS, and it revealed some confusion about how branches should be used. The coworker suggested that a new branch should be created each time code is pushed to any environment, because it's the only way you can be completely sure the branch isn't polluted.

To get the obvious out of the way: if a source control system "pollutes" a branch with no human intervention, get a new source control system, because it has failed at its most basic task. But that is very unlikely to be the case. The problem is more likely how you are executing your branching and merging strategy.

Let’s take a typical branching structure: Dev > QA > Prod; Dev is the parent of QA, which is the parent of Prod. Changes are merged from Dev into QA, then from QA into Prod. You should never get merge conflicts when going from Dev to QA, or QA to Prod. Because of this, merging is clean and no “pollution” can happen.

How is this possible?

The only way a merge conflict happens is when a change has occurred in the target branch you are merging into that has not been integrated into the source branch. But - and this is the critical point - if you are following good merging practices, any change made directly in QA or Prod is immediately merged back into the parent branch(es). No exceptions! If I make a change in the QA branch, my next immediate check-in is a merge from QA into the Dev branch. I resolve any merge conflicts as part of that merge, ensuring that my change is properly integrated into Dev. The next time Dev is merged into QA, it will already contain this change, so no merge conflicts will occur.
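For what it's worth, here is a minimal sketch of that immediate merge-back using the same TFS client API as the changeset example above; the collection URL, local workspace path, and branch paths are hypothetical, and actual conflict resolution is left to the developer (in practice you would usually do this merge in Visual Studio or with tf merge).

using System;
using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.VersionControl.Client;

class MergeBackSample
{
    static void Main()
    {
        using (var tpc = new TfsTeamProjectCollection(new Uri("http://yourtfs/server/")))
        {
            var versionControl = tpc.GetService<VersionControlServer>();

            // The workspace mapped to this (hypothetical) local path.
            var workspace = versionControl.GetWorkspace(@"C:\source\workspace");

            // Pend a merge of every QA change that has not yet been integrated into Dev.
            workspace.Merge("$/project-name/QA", "$/project-name/Dev", null, null);

            // If the merge produced conflicts, resolve them before checking in.
            var conflicts = workspace.QueryConflicts(new[] { "$/project-name/Dev" }, true);
            if (conflicts.Length > 0)
            {
                Console.WriteLine("{0} conflict(s) must be resolved before check-in.", conflicts.Length);
                return;
            }

            workspace.CheckIn(workspace.GetPendingChanges(), "Merge QA change back into Dev");
        }
    }
}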

Wednesday, May 14, 2014

PowerShell Copy-Item Recursion

I ran into something interesting today when doing a recursive copy in PowerShell. The “*” character makes a huge difference when you are copying to an existing directory with files.

For example, if I have a d:\temp\files directory that I want to copy to d:\temp\files2, I can do so with:

copy-item d:\temp\files d:\temp\files2 -recurse -force

However, run that same line again after changing a file or two and, though it appears to succeed, it won't actually copy the changed files. For that to happen, you must include a * at the end of the source path:

copy-item d:\temp\files\* d:\temp\files2 -recurse -force